ABSTRACT


One of the most crucial problems in theoretical and applied statistics is to determine the precision of the estimates produced by different statistical estimators. This problem becomes considerably harder when the parametric characteristics of the population are not known. A parallel problem is that of deciding how large (or small) the sample must be in order to obtain a desired precision within a certain range.

There are several non-parametric methods to approach the first problem. The BOOTSTRAP method (Efron, 1979) is one of these approaches and the one of interest in this thesis. With this method, one could improve the precision of the estimates and gain information about the distributional characteristics of statistical estimators. The bootstrap method has been amply compared with other methods; the results show that the bootstrap method often produces more precise estimates (i.e., with smaller mean squared error) than competitors such as the JACKKNIFE, SECTIONING and CROSS-VALIDATION. However, the results that have been obtained are based on large sample sizes and large numbers of "bootstrap" replications.

This  thesis  analyzes  the  behavior  of  the  BOOTSTRAP  method  when  the  number 
of  bootstrap  replications  is  small.  It  tries  to  identify  any  tradeoffs  between  sample  size 
and  the  number  of  bootstrap  replications  required  to  attain  a  desired  precision  in  the 
estimates  produced  in  several  particular  situations.  One  of  the  goals  is  to  produce 
graphical  displays  that  will  indicate  to  the  experimental  statistician  the  price  that  must 
be  paid  in  the  precision  of  the  estimates,  obtained  with  the  bootstrap  method,  when 
sample  size  is  small,  and  the  number  of  bootstrap  replications  to  use  in  this  situation. 




TABLE OF CONTENTS

I. INTRODUCTION
   A. BACKGROUND
   B. THE GENERAL PROBLEM
   C. ORGANIZATION
II. THE BOOTSTRAP METHOD
   A. A DESCRIPTION OF THE METHOD
      1. Direct Analytical Calculations
      2. Monte Carlo Simulation
III. APPLICATION OF THE BOOTSTRAP METHOD: SOME RESULTS
   A. THE MEAN, VARIANCE AND THE COEFFICIENT OF VARIATION OF EXPONENTIAL RANDOM VARIATES
   B. THE SAMPLE VARIANCE
   C. THREE DIFFERENT ESTIMATORS FOR THE VARIANCE
   D. THE CENTER OF A DISTRIBUTION: COMPARISON OF THE MEAN, MEDIAN AND TRIMMED MEAN
   E. LINEAR REGRESSION BY BOOTSTRAPPING THE RESIDUALS
IV. CONCLUSIONS
APPENDIX A: LIST OF SPECIAL NOTATIONS
APPENDIX B: FORTRAN CODE FOR BOOTSTRAPPING
APPENDIX C: MSE*h OF SOME ESTIMATORS USING THE BOOTSTRAP METHOD
LIST OF REFERENCES
INITIAL DISTRIBUTION LIST


LIST  OF  TABLES 


ASYMPTOTIC VARIANCE OF THE MEAN, MEDIAN AND 5% TRIMMED MEAN


LIST OF FIGURES

3.1 MSE*h of Bootstrap Sample Mean: Exp(1)
3.2 MSE*h of Bootstrap Sample Variance: Exp(1)
3.3 MSE*h of Bootstrap Coeff. of Variation: Exp(1)
3.4 Bootstrap Dist. of Sample Mean B = 5
3.5 Bootstrap Dist. of Sample Variance B = 5
3.6 MSE*h of Bootstrap Sample Variance of a G(0.5,1)
3.7 MSE*h of Bootstrap Sample Variance of a N(0,1)
3.8 MSE*h of Bootstrap Sample Variance of a L(0,1)
3.9 MSE*h of the Sample Variance of a N(0,1)
3.10 MSE*h of the 2nd Variance Estimator of a N(0,1)
3.11 MSE*h of the 3rd Variance Estimator of a N(0,1)
3.12 Asymptotic MSE of the Sample Mean of a N(0,1)
3.13 Asymptotic MSE of the Sample Median of a N(0,1)
3.14 Asymptotic MSE of the Sample 5% Trimmed Mean of a N(0,1)
3.15 Asymptotic MSE of the Sample Mean of a L(0,1)
3.16 Asymptotic MSE of the Sample Median of a L(0,1)
3.17 Asymptotic MSE of the Sample 5% Trimmed Mean of a L(0,1)
3.18 Estimated Averages MSE of ph
C.1 MSE*h of the Estimators for Exp(1)
C.2 MSE*h of S2
C.3 MSE*h of 1S*2, 2S*2 and of 3S*2
C.4 Bootstrap Dist. of Sample Mean B = 150
C.5 Bootstrap Dist. of Sample Variance B = 150


I.  INTRODUCTION 


A.  BACKGROUND 

One of the most common problems in applied statistics is the estimation of an unknown parameter $\theta$. Once the statistician has decided on the model having one or more parameters to be estimated and has selected the estimator (e.g., m.l.e., least-squares estimator, etc.) that will be used to obtain the estimates, the second problem that he or she faces is how to estimate the accuracy of these estimates. There are several ways of measuring the accuracy or the error of statistical estimators. In this thesis, the measure of statistical error will be defined to be the mean squared error (MSE) of the estimators, i.e., the variance plus the squared bias of $\hat{\theta}$, where $\hat{\theta}$ represents the estimator of the parameter $\theta$ (Appendix A gives a list of the special notations used in this thesis):

$$\mathrm{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2] = \mathrm{Var}(\hat{\theta}) + [\mathrm{BIAS}(\hat{\theta})]^2 \tag{1.1}$$
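As a small illustration of the decomposition in (1.1), the following Python sketch computes the MSE, variance and bias of a set of simulated estimates and compares the two sides. The function name `mse_decomposition`, the Exp(1) population and the sample sizes are assumptions chosen for illustration only; they are not taken from the thesis software.

```python
import random

def mse_decomposition(estimates, theta):
    """Return (mse, variance, bias) for a list of estimates of the true value theta."""
    m = len(estimates)
    mean_est = sum(estimates) / m
    bias = mean_est - theta
    # Divisor m (not m - 1) so that mse == variance + bias**2 holds exactly.
    variance = sum((e - mean_est) ** 2 for e in estimates) / m
    mse = sum((e - theta) ** 2 for e in estimates) / m
    return mse, variance, bias

rng = random.Random(0)
theta = 1.0  # true mean of an Exp(1) population (illustrative choice)
# 2000 sample means, each computed from 10 Exp(1) variates.
estimates = [sum(rng.expovariate(1.0) for _ in range(10)) / 10 for _ in range(2000)]
mse, variance, bias = mse_decomposition(estimates, theta)
print(mse, variance + bias ** 2)  # the two numbers agree up to rounding
```

The divisor m is used for the variance so that the identity holds exactly for the simulated estimates rather than only in expectation.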

When the practitioner is dealing with samples obtained from populations for which the distributional characteristics are known, classical statistical theory provides an answer to the second problem that the statistician faces. This is true since, at least in theory, the variance and the bias of most statistical estimators can be calculated analytically. However, the difficulty of analytically deriving the MSE of a statistical estimator increases as the mathematical definition of the estimator becomes more complicated. When this is the case, or when the practitioner does not actually know the probability distribution, say $F$, from which the sample was obtained, then the MSE of the estimators must be estimated.

There are several non-parametric methods for estimating the bias and the variance of an estimator of interest. The most common ones are the Quenouille-Tukey JACKKNIFE method, CROSS-VALIDATION, and SECTIONING, the jackknife being the most commonly used of the three approaches. Efron and Gong [Ref. 1] and Miller [Ref. 2] provide an excellent exposition of the first two methods, and Lewis gives a good introduction and analysis of the latter (see [Ref. 3]).


In  recent  years,  Efron  [Refs.  1,4],  has  developed  another,  rather  intriguing 
non-parametric  methodology  for  estimating  the  MSE  of  any  statistic.  This  method, 
called  the  BOOTSTRAP,  is  simple  and  has  been  shown  by  Efron  to  be  a  powerful 
statistical  tool  that  can  be  applied  even  in  complex  situations  (See  Efron,  [Ref.  5]  and 
[Ref.  6]  ).  This  method,  as  shown  in  this  thesis,  is  a  good  approach  for  estimating  the 
precision  of  a  statistical  estimator  used  in  a  given  model.  It  also  gives  information 
about  the  distributional  characteristics  of  the  estimator  used.  Efron  and  Gong  [Ref.  1] 
and  Tibshirani  [Ref.  7]  have  conducted  intensive  analyses  of  this  new  method  and  have 
compared it with the other non-parametric methods mentioned above. Surprisingly for some authors, the BOOTSTRAP has been shown to produce estimates with much more precision (sometimes up to twenty percent lower variance, for example) than the JACKKNIFE and CROSS-VALIDATION estimators. As an example, Efron [Ref. 4: Section 3] has shown that the BOOTSTRAP methodology correctly estimates, asymptotically, the variance of the sample median, a case where the JACKKNIFE is known to fail. As in the case of the sample median, it is known that the JACKKNIFE collapses for non-smooth statistics; however, the BOOTSTRAP seems to produce accurate estimates even in these cases.

B.  THE  GENERAL  PROBLEM 

Suppose that the realization $x_1, x_2, \ldots, x_n$ of a random sample $X_1, X_2, \ldots, X_n$ has been observed, and that $X_1, X_2, \ldots, X_n$ are independent and identically distributed (i.i.d.), having a probability distribution $F$. In practice, the distribution $F$ is probably unknown and the problem is to estimate the value of some parameter of interest, such as the mean, variance, or median. This is done using a sample of size $n$ with some estimator of $\theta(F)$, say $\hat{\theta}(F)$. The basic idea of the BOOTSTRAP method is very simple, at least in principle:¹ having observed $x_1, x_2, \ldots, x_n$, construct the sample empirical probability distribution, $\hat{F}$, by putting mass $1/n$ at each observation $x_1, x_2, \ldots, x_n$. Now, fixing $\hat{F}$, draw a random sample of size $n$ with replacement from $\hat{F}$. This sample will be called a bootstrap random sample and will be denoted by

$$X^* = (X^*_1, X^*_2, \ldots, X^*_n) \tag{1.2}$$

and then $X^*_i \overset{\text{iid}}{\sim} \hat{F}$. Then the task is to estimate the distribution of $\hat{\theta}(F)$ by the distribution of $\theta^*(\hat{F})$, where $\theta^*(\hat{F})$ denotes the value of the parameter of interest based on the bootstrap mechanism. This mechanism proceeds as follows: keeping $\hat{F}$ fixed, draw a bootstrap sample and calculate $\theta^*(\hat{F})$; do this a large number $B$ of times, obtaining $\theta^*_1(\hat{F}), \theta^*_2(\hat{F}), \ldots, \theta^*_B(\hat{F})$. The resultant (sample) distribution of $\theta^*$ is called the bootstrap distribution $\hat{F}^*$. Once $\hat{F}^*$ is obtained, any specific feature of this distribution, such as the expected value of $\theta^*$, $E^*(\theta^*)$, or the variance of $\theta^*$, $\mathrm{Var}^*(\theta^*)$, could be obtained. (In this thesis, notation like "$E^*$", "$\mathrm{Var}^*$", "$S^*$", "$X^*$", etc., indicates calculations relating to the conditional bootstrap distribution of $X^*$, with the vector of random variates $X$, and hence $\hat{F}$, fixed.²) Theoretically, then, the bootstrap idea could be used to estimate the expected value, the variance, and the mean squared error of any estimator, given a sample that comes from an unknown probability distribution $F$.

¹The BOOTSTRAP methodology will be analyzed in more detail in Chapter 2.

As mentioned earlier, Efron (see [Ref. 4]) has shown that this method is often more precise than other non-parametric methods for assessing statistical accuracy. However, the experimentation done in the past using this method relied on a large number $B$ of bootstrap replications, i.e., a large sample on $\theta^*$. In some cases, it can be shown (see Chapter 2, for the case of $\mathrm{Var}^*(\theta^*)$) that as $B \to \infty$, the variance of $\theta^*$ based on $\hat{F}$ is equal to the variance of the estimator $\hat{\theta}$ based on $F$. But how large $B$ must be in order to obtain estimates that are accurate, or to obtain estimators with a small MSE, is a question to be answered. Also, what is the tradeoff between the sample size $n$ and the number $B$ of bootstrap replications?

The  purpose  of  this  thesis  is  then  twofold  :  first,  to  analyze  the  bootstrap 
performance  as  the  number  B  of  replications  increases,  starting  from  a  small  B.  The 
second,  also  of  great  interest,  is  to  study  the  relationship  between  the  sample  size  n  and 
the  number  B  in  the  estimation  of  the  MSE  of  the  estimator  using  the  bootstrap 
mechanism. 

C.  ORGANIZATION 

There are several methods of determining the bootstrap distribution of an estimator $\theta^*(\hat{F})$, two of which will be analyzed in this thesis.³ The first is by direct theoretical calculations (this is usually the most difficult approach). The second relies on Monte Carlo approximations to the bootstrap distribution: repeated realizations of $X^*$ are generated by taking random samples of size $n$ from $\hat{F}$, say $x^{*1}, x^{*2}, \ldots, x^{*B}$, and the histogram of the corresponding values $\theta^*_1(\hat{F}), \theta^*_2(\hat{F}), \ldots, \theta^*_B(\hat{F})$ is constructed as an approximation to the actual bootstrap distribution (see [Ref. 1: Section 2]). These two methods are of interest in the second chapter. In the last section of Chapter Two, the different statistical experiments conducted for this thesis are explained in detail. In Chapter Three, the results from these experiments are presented and analyzed, and the problem of using the bootstrap approach in linear regression problems is also discussed. Conclusions are presented in the last chapter. There, one of the points of interest is to discuss the main disadvantage of the bootstrap methodology: the computer time required to implement this method when Monte Carlo simulation is used. In Appendix B, the FORTRAN software that was designed to run the experiments discussed in this thesis is explained and the code is listed. This computer program is user friendly and can be used to estimate the bootstrap distribution of eight different estimators. Finally, in Appendix C, the reader can see some tables that give a good idea about how large (or small) $B$ and $n$ can be in order to obtain a desired precision on the estimates of parameters of given populations $F$.

²As will be shown in the next chapter, this is a critical feature of the BOOTSTRAP method: the vector of random variates $X$ and $\hat{F}$ must be fixed through the process.

³A third method involves making a Taylor series expansion to obtain the approximate mean and variance of the bootstrap distribution $\hat{F}^*$. See Ref. 4, Section 5.


II. THE BOOTSTRAP METHOD


A.  A  DESCRIPTION  OF  THE  METHOD 

As  mentioned  earlier,  the  Bootstrap  methodology  is,  in  principle,  simple.  Also, 
recall  that  in  this  thesis  the  problem  of  interest  is  to  study  how  this  method  performs  in 
estimating  the  MSE  of  some  statistical  estimators,  and  how  the  MSE  behaves  as  the 
number  B  of  bootstrap  replications  and  the  sample  size  n  change. 

Suppose that the data of interest consist of a random sample $X = (X_1, X_2, \ldots, X_n)$ of size $n$, from an unspecified probability distribution $F$ on the real line. The $X_i$ may be real valued, two dimensional, or take values in a more complicated space, but this will not affect the theory; see Efron [Ref. 2]. Thus, it is assumed that

$$X_1, X_2, \ldots, X_n \overset{\text{iid}}{\sim} F. \tag{2.1}$$

The problem is now to estimate the probability distribution of a specific estimator of a parameter $\theta(F)$, say $\hat{\theta}(F)$. The probability distribution of $\hat{\theta}(F)$ could be approximated by the following algorithm (see Efron [Ref. 1: Section 2]):

(1) given that the realization of $X$ has been observed, say $X_i = x_i$, $i = 1, 2, \ldots, n$,

(2) construct the sample probability distribution $\hat{F}$, by putting mass $1/n$ at each point $x_1, x_2, \ldots, x_n$,

(3) keeping $X_i$ and $\hat{F}$ fixed, draw with replacement a random sample of size $n$ from $\hat{F}$, and call this the bootstrap sample; i.e., $X^*_i = x^*_i$, where $X^*_i \overset{\text{iid}}{\sim} \hat{F}$, so

$$P(X^*_i = x_j \mid X = x) = 1/n, \tag{2.2}$$

(4) the distribution of $\hat{\theta}(F)$ can be approximated by a sample on $\theta^*(\hat{F})$; then, a measure of accuracy could be assigned to $\hat{\theta}(F)$ based on $\theta^*(\hat{F})$.

As mentioned earlier, the distribution of some estimators $\theta^*(\hat{F})$ might be calculated analytically.

1. Direct Analytical Calculations

An attempt is now made to calculate some parameters of interest of the distribution of $X^*_i$. Assuming the conditions shown in expressions (2.1) and (2.2), the expected value of $X^*_i$, given $X$, could be calculated as follows:

$$E^*(X^*_i) = E(X^*_i \mid X = x) = \sum_j x_j \, P(X^*_i = x_j \mid X = x), \tag{2.3}$$

where $j = 1, 2, \ldots, n$. From (2.2), this is equal to:

$$E^*(X^*_i) = \sum_j (x_j / n) = \bar{X}, \quad i = 1, 2, \ldots, n, \tag{2.4}$$

which is the sample mean of the original sample $X$. Then from (2.4), the unconditional expected value of $X^*_i$ is:

$$E(X^*_i) = E[E^*(X^*_i \mid X)] = E(\bar{X}) = \mu_X, \quad i = 1, 2, \ldots, n. \tag{2.5}$$

Thus, the unconditional expectation of $X^*_i$ is equal to the mean of the population from which the original sample was obtained. (Note: from this point on all summation signs go from 1 to $n$, unless otherwise specified, and $E^*$, $\mathrm{Var}^*$, etc., are conditional, given $X$.)

Likewise, the unconditional variance of $X^*_i$ could be derived from the conditional variance of $X^*_i$:

$$\mathrm{Var}^*(X^*_i) = E^*[(X^*_i - E(X^*_i \mid X = x))^2]. \tag{2.6}$$

Using (2.5) this expression is equivalent to:

$$\begin{aligned}
\mathrm{Var}^*(X^*_i) &= E[(X^*_i - \bar{X})^2 \mid X] \\
&= E^*(X^{*2}_i) - \bar{X}^2 \\
&= \sum_i (X_i^2 / n) - \bar{X}^2 \\
&= \sum_i (X_i - \bar{X})^2 / n
\end{aligned} \tag{2.7}$$

By definition of the sample variance, $S^2_X$, then

$$\mathrm{Var}^*(X^*_i) = \frac{n-1}{n} S^2_X \tag{2.8}$$
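The conditional moments of a single bootstrap draw can be checked numerically. The sketch below, written in Python rather than the FORTRAN of Appendix B, computes the mean and variance of one draw from the empirical distribution of a small sample and compares the variance with (n-1)/n times the sample variance, as in (2.8). The function names and the data values are hypothetical.

```python
def bootstrap_draw_moments(x):
    """Exact mean and variance of one draw X*_i from the empirical distribution of x."""
    n = len(x)
    mean_star = sum(x) / n                                  # sum_j x_j / n
    var_star = sum(v * v for v in x) / n - mean_star ** 2   # E*(X*^2) - mean^2
    return mean_star, var_star

def sample_variance(x):
    """Usual sample variance S^2 with divisor n - 1."""
    n = len(x)
    xbar = sum(x) / n
    return sum((v - xbar) ** 2 for v in x) / (n - 1)

x = [2.0, 4.0, 4.0, 5.0, 10.0]  # illustrative data, not from the thesis
n = len(x)
mean_star, var_star = bootstrap_draw_moments(x)
print(mean_star, var_star, (n - 1) / n * sample_variance(x))
```

For this sample the conditional mean equals the sample mean, and the last two printed values coincide up to rounding.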
Now, unconditionally

$$\begin{aligned}
\mathrm{Var}(X^*_i) &= E(\mathrm{Var}^*(X^*_i)) + \mathrm{Var}[E^*(X^*_i)] \\
&= E\Big[\sum_i (X_i^2 / n) - \bar{X}^2\Big] + \mathrm{Var}(\bar{X}) \\
&= E\Big[\frac{n-1}{n} S^2_X\Big] + \sigma^2_X / n \\
&= \frac{n-1}{n} E(S^2_X) + \sigma^2_X / n \\
&= \frac{n-1}{n} \sigma^2_X + \sigma^2_X / n = \sigma^2_X
\end{aligned} \tag{2.9}$$

Therefore, the variance (unconditional) of $X^*_i$ is the same as the variance of $X_i$. The covariance between $X^*_i$ and $X^*_j$ has a very important impact on the bootstrap methodology, primarily when the bootstrap distribution of $\theta^*_i(\hat{F})$ is approximated by Monte Carlo simulation (see next section).

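Conditionally (given $X$), the covariance between $X^*_i$ and $X^*_j$ is as follows:

$$\mathrm{Cov}^*(X^*_i, X^*_j) = E^*[(X^*_i - E^*(X^*_i))(X^*_j - E^*(X^*_j))]. \tag{2.10}$$

From (2.5), this is

$$\begin{aligned}
\mathrm{Cov}^*(X^*_i, X^*_j) &= E^*[(X^*_i - \bar{X})(X^*_j - \bar{X})] \\
&= E^*(X^*_i X^*_j) - \bar{X}^2
\end{aligned} \tag{2.11}$$

Now conditionally, given $X = x$, the joint distribution of $(X^*_i, X^*_j)$ is uniform over the points $(x_1, x_2, \ldots, x_n) \times (x_1, x_2, \ldots, x_n)$, and this implies that $(X^*_i, X^*_j) = (x_k, x_l)$ with probability $1/n^2$. Then

$$\begin{aligned}
E^*(X^*_i X^*_j) &= \sum_k \sum_l (x_k x_l) / n^2, \quad i \neq j \\
&= (1/n^2) \Big(\sum_k x_k\Big)^2 = \bar{x}^2.
\end{aligned} \tag{2.12}$$

Finally, the conditional covariance between $X^*_i$ and $X^*_j$ is

$$\mathrm{Cov}^*(X^*_i, X^*_j) = \bar{X}^2 - \bar{X}^2 = 0. \tag{2.13}$$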
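The zero conditional covariance in (2.13) can be verified by enumerating the n² equally likely pairs of values that two distinct bootstrap draws can take. This is a minimal Python sketch under that assumption; `conditional_covariance` is a hypothetical helper name and the data are illustrative.

```python
from itertools import product

def conditional_covariance(x):
    """Cov*(X*_i, X*_j), i != j, by enumerating all n^2 equally likely value pairs."""
    n = len(x)
    xbar = sum(x) / n
    # Each ordered pair (x_k, x_l) has probability 1/n^2 given the original sample.
    e_product = sum(a * b for a, b in product(x, repeat=2)) / n ** 2
    return e_product - xbar ** 2

print(conditional_covariance([2.0, 4.0, 4.0, 5.0, 10.0]))  # zero up to rounding
```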

Now, to derive the unconditional covariance between $X^*_i$ and $X^*_j$, it will be convenient to use the result obtained in equation (2.13). To use (2.13), it must be shown that the following equality holds:

$$\mathrm{Cov}(X^*_i, X^*_j) = E[\mathrm{Cov}^*(X^*_i, X^*_j)] + \mathrm{Cov}[E^*(X^*_i), E^*(X^*_j)]. \tag{2.14}$$

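To show this, notice that the conditional covariance can be defined as

$$\begin{aligned}
\mathrm{Cov}(X, Y \mid Z) &= E_{(X,Y \mid Z)}[(XY - E(X \mid Z) E(Y \mid Z)) \mid Z] \\
&= E_{(X,Y \mid Z)}(XY \mid Z) - [E(X \mid Z) E(Y \mid Z)].
\end{aligned} \tag{2.15}$$

Then

$$\begin{aligned}
E_Z[\mathrm{Cov}(X, Y \mid Z)] &= E_Z[E_{(X,Y \mid Z)}(XY \mid Z) - \{E(X \mid Z) E(Y \mid Z)\}] \\
&= E_Z[E_{(X,Y \mid Z)}(XY \mid Z)] - \{E_Z[E(X \mid Z)] \, E_Z[E(Y \mid Z)]\} \\
&\quad - E_Z[E(X \mid Z) E(Y \mid Z)] + \{E_Z[E(X \mid Z)] \, E_Z[E(Y \mid Z)]\} \\
&= \mathrm{Cov}(X, Y) - \mathrm{Cov}[E(X \mid Z), E(Y \mid Z)].
\end{aligned} \tag{2.16}$$

Therefore,

$$\mathrm{Cov}(X, Y) = E_Z[\mathrm{Cov}(X, Y \mid Z)] + \mathrm{Cov}[E(X \mid Z), E(Y \mid Z)]. \tag{2.17}$$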

With this in mind, the unconditional covariance could finally be computed by using (2.15). Now, the portion inside the brackets of the first term of the right hand side of equation (2.14) was shown in (2.13) to be equal to zero. Then, using expression (2.5), equation (2.14) reduces to

$$\mathrm{Cov}(X^*_i, X^*_j) = \mathrm{Cov}(\bar{X}, \bar{X}) = \mathrm{Var}(\bar{X}) = \sigma^2_X / n, \tag{2.18}$$

and from (2.18), the correlation coefficient is given by

$$\rho(X^*_i, X^*_j) = 1/n = P[X^*_i = x_j] \tag{2.19}$$
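A Monte Carlo check of the unconditional correlation 1/n in (2.19) is sketched below. Each replication draws a fresh original sample and then two bootstrap values from it, so the randomness of the original sample is averaged over. The N(0,1) population, the sample size n = 4, the number of replications and the function name are all assumptions made for illustration.

```python
import random

def unconditional_correlation(n, reps, rng):
    """Monte Carlo estimate of corr(X*_i, X*_j), i != j, averaging over original samples."""
    a, b = [], []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]  # a fresh original sample X
        a.append(rng.choice(x))  # X*_i, drawn from the empirical distribution of x
        b.append(rng.choice(x))  # X*_j, drawn independently given x
    mean_a, mean_b = sum(a) / reps, sum(b) / reps
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / reps
    var_a = sum((u - mean_a) ** 2 for u in a) / reps
    var_b = sum((v - mean_b) ** 2 for v in b) / reps
    return cov / (var_a * var_b) ** 0.5

# With n = 4 the estimate should be close to 1/4.
print(unconditional_correlation(4, 50000, random.Random(1)))
```

The two draws are uncorrelated once the original sample is held fixed; the positive correlation appears only when the original sample is itself treated as random.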

Comparing  equations  (2.13)  and  (2.18)  it  could  then  be  stated  that  the 
bootstrap  samples  are  (conditionally)  independent  as  long  as  X  is  held  fixed. 

It is possible now to derive the distributional characteristics of some statistical estimators based on the distribution of $X^*_i$. In doing this, it is assumed that the original sample $X$ is fixed and these derivations are conditional. For example, the expected value and the variance of $\bar{X}^*$ (the bootstrapped sample mean) are obtained as follows: using equation (2.5)

$$E^*(\bar{X}^*) = \bar{X}, \tag{2.20}$$

so unconditionally, the expected value of the bootstrap sample mean is

$$E(\bar{X}^*) = E(\bar{X}) = \mu_X. \tag{2.21}$$

The conditional variance of the bootstrap sample mean is

$$\begin{aligned}
\mathrm{Var}^*(\bar{X}^*) &= \frac{1}{n^2} \mathrm{Var}^*\Big[\sum_i X^*_i\Big] \\
&= \frac{1}{n^2} \Big[\sum_i \mathrm{Var}^*(X^*_i) + n(n-1) \, \mathrm{Cov}^*(X^*_i, X^*_j)\Big].
\end{aligned} \tag{2.22}$$

From equation (2.13), the conditional variance is then

$$\begin{aligned}
\mathrm{Var}^*(\bar{X}^*) &= \frac{1}{n^2} \Big[\sum_i \mathrm{Var}^*(X^*_i)\Big] \\
&= \frac{1}{n^2} [n \, \mathrm{Var}^*(X^*_i)].
\end{aligned} \tag{2.23}$$

Using equation (2.8), finally

$$\mathrm{Var}^*(\bar{X}^*) = \frac{n-1}{n^2} S^2_X. \tag{2.24}$$
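For a very small sample, (2.24) can be confirmed exactly by listing every one of the n^n equally likely bootstrap samples. The Python sketch below does this for n = 4; the helper name and the data values are hypothetical, and the enumeration is only practical for tiny n.

```python
from itertools import product

def exact_bootstrap_mean_variance(x):
    """Var*(Xbar*) computed over all n**n equally likely bootstrap samples of x."""
    n = len(x)
    means = [sum(s) / n for s in product(x, repeat=n)]
    centre = sum(means) / len(means)
    return sum((m - centre) ** 2 for m in means) / len(means)

x = [1.0, 2.0, 4.0, 7.0]  # illustrative data
n = len(x)
xbar = sum(x) / n
s2 = sum((v - xbar) ** 2 for v in x) / (n - 1)
print(exact_bootstrap_mean_variance(x), (n - 1) / n ** 2 * s2)
```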

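With this expression, the unconditional variance of $\bar{X}^*$ is given by

$$\mathrm{Var}(\bar{X}^*) = E[\mathrm{Var}^*(\bar{X}^*)] + \mathrm{Var}[E^*(\bar{X}^*)]. \tag{2.25}$$

From equations (2.5) and (2.20),

$$\begin{aligned}
\mathrm{Var}(\bar{X}^*) &= E\Big[\frac{n-1}{n^2} S^2_X\Big] + \mathrm{Var}(\bar{X}) \\
&= \frac{n-1}{n^2} \sigma^2_X + \sigma^2_X / n \\
&= \frac{2n-1}{n} \mathrm{Var}(\bar{X})
\end{aligned}$$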

As mentioned earlier, equation (2.24) is the one of interest when one wants to apply the bootstrap mechanism to obtain the variance of $\bar{X}^*$. Notice that as $n \to \infty$,

$$\mathrm{Var}^*(\bar{X}^*) \to \mathrm{Var}(\bar{X}) \tag{2.26}$$

strongly (strong law of large numbers), but this is not the case for the unconditional variance of $\bar{X}^*$, where as $n \to \infty$,

$$\mathrm{Var}(\bar{X}^*) \to 2 \, \mathrm{Var}(\bar{X}). \tag{2.27}$$
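The factor (2n-1)/n relating the unconditional variance of the bootstrap mean to the variance of the ordinary sample mean can be illustrated by simulation. In this sketch each replication generates a new original sample and one bootstrap sample from it; the N(0,1) population, n = 5 and the number of replications are assumptions for illustration, and the function name is hypothetical.

```python
import random

def simulated_variance_ratio(n, reps, rng):
    """Estimate Var(Xbar*) / Var(Xbar) when the original sample is also random."""
    means, boot_means = [], []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]   # original sample
        x_star = [rng.choice(x) for _ in range(n)]    # one bootstrap sample from it
        means.append(sum(x) / n)
        boot_means.append(sum(x_star) / n)

    def variance(values):
        centre = sum(values) / len(values)
        return sum((v - centre) ** 2 for v in values) / len(values)

    return variance(boot_means) / variance(means)

n = 5
# The simulated ratio should be near (2n - 1) / n, which tends to 2 as n grows.
print(simulated_variance_ratio(n, 40000, random.Random(2)), (2 * n - 1) / n)
```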


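It is now possible to define an estimator for the MSE of the mean of a population based on $X^*$:

$$\begin{aligned}
\mathrm{MSE}^*(\bar{X}^*) &= \mathrm{Var}^*(\bar{X}^*) + [E^*(\bar{X}^* - E^*(\bar{X}^*))]^2 \\
&= \mathrm{Var}^*(\bar{X}^*) + [\mathrm{Bias}^*(\bar{X}^*)]^2
\end{aligned} \tag{2.28}$$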

In  the  same  manner,  the  MSE  of  any  estimator  could  be  derived.  However,  it 
is  easy  to  see  that  as  the  mathematical  definition  of  the  estimator  gets  more 
complicated,  this  procedure  can  become  very  tedious.  This  is  why  it  is  desired  to 
estimate  the  bootstrap  distribution  of  the  estimator  by  simulation  rather  than 
analytically. 

2.  Monte  Carlo  Simulation 

The algorithm presented in Chapter II, Section A, could be expanded to allow Monte Carlo simulation to approximate the bootstrap distribution of $\theta^*(\hat{F})$. As before (see Efron [Ref. 2: Section 2]):

(1) given that the realization of the random vector $X$ has been observed, say $X_i = x_i$ for $i = 1, 2, \ldots, n$;

(2) construct the sample probability distribution $\hat{F}$, by giving a mass $1/n$ at each point $x_1, x_2, \ldots, x_n$;

(3) keeping $X_i$ (and thus $\hat{F}$) fixed, draw with replacement a random sample of size $n$ from $\hat{F}$, and call this a bootstrap sample;

(4) from this random sample, compute the bootstrap replication, $\theta^*_j(\hat{F})$; i.e., compute the value of the desired statistic based on the sample from $\hat{F}$. Then,

(5) do steps (3) and (4) a "large" number $B$ of times. In this way one obtains independent bootstrap replications of $\theta^*(\hat{F})$, say $\theta^*_1(\hat{F}), \theta^*_2(\hat{F}), \ldots, \theta^*_B(\hat{F})$;

(6) now, approximate the variance of $\theta^*(\hat{F})$ by the sample variance

$$\widehat{\mathrm{Var}}^*[\theta^*(\hat{F})] = \sum_i [\theta^*_i(\hat{F}) - \bar{\theta}^*(\hat{F})]^2 / (B - 1), \tag{2.29}$$

where $i = 1, 2, \ldots, B$, and

$$\bar{\theta}^*(\hat{F}) = \sum_i \theta^*_i(\hat{F}) / B. \tag{2.30}$$

The MSE of $\theta^*(\hat{F})$ may be estimated by

$$\widehat{\mathrm{MSE}}^*(\theta^*(\hat{F})) = \widehat{\mathrm{Var}}^*[\theta^*(\hat{F})] + [\widehat{\mathrm{BIAS}}^*(\theta^*(\hat{F}))]^2. \tag{2.31}$$
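The six steps above can be sketched in a few lines of Python. This is an illustrative outline, not the FORTRAN program of Appendix B: the function names are hypothetical, and the `reference` argument (the value against which the bias is measured, taken here to be a known true parameter as in the experiments of Chapter Three) is an assumption.

```python
import random

def mean(values):
    return sum(values) / len(values)

def bootstrap_replications(x, statistic, B, rng):
    """Steps (3)-(5): B replications of `statistic` on samples drawn with replacement from x."""
    n = len(x)
    return [statistic([rng.choice(x) for _ in range(n)]) for _ in range(B)]

def bootstrap_variance(reps):
    """Step (6): sample variance of the replications, divisor B - 1 as in (2.29)."""
    centre = mean(reps)  # mean of the B replications
    return sum((r - centre) ** 2 for r in reps) / (len(reps) - 1)

def bootstrap_mse(reps, reference):
    """Variance plus squared bias as in (2.31); `reference` is an assumed known target value."""
    return bootstrap_variance(reps) + (mean(reps) - reference) ** 2

rng = random.Random(3)
x = [rng.expovariate(1.0) for _ in range(20)]      # illustrative Exp(1) sample, n = 20
reps = bootstrap_replications(x, mean, 100, rng)   # B = 100 replications of the sample mean
print(bootstrap_variance(reps), bootstrap_mse(reps, 1.0))
```

Passing the statistic as a function keeps the resampling loop independent of the estimator, so the same sketch could be applied to the variance, the median or a trimmed mean.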


It will be seen in Chapter Three that as $B$ and $n$ get large, $\widehat{\mathrm{MSE}}^*(\theta^*(\hat{F}))$ approaches zero. A problem in using the bootstrap is the choice of $B$, and we consider this in Chapter Three.

This  bootstrap  simulation  procedure  was  carried  out  to  study  the  effect  of 
possible  choices  of  B,  in  terms  of  the  estimated  MSE  of  several  estimators.  The  reader 
will  see,  in  the  next  chapter,  that  the  choice  of  B  should  depend  on  the  sample  size  n, 
the  specific  estimator  under  consideration  and  the  structure  of  the  population  from 
which  the  sample  was  obtained. 

a.  The  Statistical  Experiment 

In  this  thesis,  various  experiments  were  conducted  to  study  the  problem 
of  selecting  B.  The  main  idea  behind  these  experiments  was  to  select  some  well  known 
probability  distributions  and  some  parametric  estimators  for  which  the  distributional 
characteristics  are  well  known.  Then  the  MSE  of  these  estimators  could  be  determined 
theoretically.  Therefore,  one  could  compare  this  true  MSE  with  the  estimated  MSE  of 
the  estimators  obtained  using  the  bootstrap  mechanism. 

The  critical  part  of  the  experiment  was  to  design  an  effective  computer 
code  to  perform  the  Monte  Carlo  simulation.  The  FORTRAN  program  developed  to 
carry  out  the  simulation  reported  here  is  listed  in  Appendix  B.  This  program  was  used 
to  analyze  the  performance  of  eight  different  estimators  based  on  the  bootstrap 
methodology.  These  were  the  sample  mean,  variance  (three  different  estimators), 
coefficient  of  correlation,  coefficient  of  variation,  the  five-percent  trimmed  mean,  and 
the  median. 

The  simulation  runs  as  follows  (See  Appendix  B): 

(1) $n$ random variates, for up to 8 values of $n$, are first generated, representing a random sample from a population $F$. (In the simulation a total of $N$ random variables are first generated, then sectioned into samples of sizes $n_i$, where $i = 1, 2, \ldots, 8$.)

(2) For each subsample of size $n$, a bootstrap function is called to generate a bootstrap sample from the original sample. Then, the estimator function is called to produce a desired estimate. This step is repeated until $B$ bootstrap samples from the original sample are obtained.

(3) After the $B$ estimates have been obtained, the statistics function is called to calculate the mean of these estimates; this number is one of the $\theta^*_i(\hat{F})$.

(4) In order to improve the precision of the simulation process, steps (2) and (3) are replicated $M$ times. Then, the process will produce a total of $(N \times M)/n$ estimates. From these estimates, a box-plot is constructed and estimates, including MSE, are calculated.
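The structure of this simulation can be outlined as follows. This Python sketch is only a reading of the four steps above, not a transcription of the FORTRAN program in Appendix B: the function and parameter names are hypothetical, and it assumes that the N variates are generated once and that only steps (2) and (3) are repeated M times.

```python
import random

def mean(values):
    return sum(values) / len(values)

def run_experiment(N, n, B, M, draw, statistic, true_value, rng):
    """Return the (N * M) / n bootstrap estimates and their MSE about true_value."""
    data = [draw(rng) for _ in range(N)]                        # step (1): N variates
    samples = [data[s:s + n] for s in range(0, N - n + 1, n)]   # sectioned into size-n samples
    estimates = []
    for _ in range(M):                                          # step (4): M replications
        for sample in samples:
            # step (2): B bootstrap samples, one estimate from each
            reps = [statistic([rng.choice(sample) for _ in range(n)]) for _ in range(B)]
            estimates.append(mean(reps))                        # step (3): mean of the B estimates
    mse = mean([(e - true_value) ** 2 for e in estimates])
    return estimates, mse

estimates, mse = run_experiment(
    N=200, n=20, B=25, M=5,
    draw=lambda r: r.expovariate(1.0),   # Exp(1) population, true mean 1
    statistic=mean, true_value=1.0, rng=random.Random(4),
)
print(len(estimates), mse)
```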

In  the  next  chapter  some  of  the  results  obtained  from  this  simulation 
process  are  analyzed. 


III.  APPLICATION  OF  THE  BOOTSTRAP  METHOD  :  SOME  RESULTS 


A. THE MEAN, VARIANCE AND THE COEFFICIENT OF VARIATION OF EXPONENTIAL RANDOM VARIATES

The first experiment conducted was intended to analyze the bootstrap mechanism in estimating the MSE of the estimators for the mean, variance and coefficient of variation of a sample coming from a population of exponential random variates with parameter $\lambda = 1$. The population coefficient of variation is defined as:

$$\mathrm{CV}(X) = \sigma_X / \mu_X \tag{3.1}$$

In the Exponential(1) case, the mean, variance and the coefficient of variation have the same value of 1. With this fact in mind, the MSE of the sample mean, as an example, is defined using (2.21) and (2.28) as:

$$\mathrm{MSE}(\bar{X}^*) = \mathrm{Var}(\bar{X}^*) + [E(\bar{X}^* - \mu_X)]^2. \tag{3.2}$$

Conditionally, from (2.26), an estimate of (3.2) is:

$$\widehat{\mathrm{MSE}}^*(\bar{X}^*) = \Big[\frac{n-1}{n^2} S^2_X\Big] + [E^*(\bar{X}^* - 1)]^2. \tag{3.3}$$
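For the sample mean, the estimate in (3.3) can be evaluated directly from the original sample, because the conditional expectation of the bootstrap mean equals the sample mean. The short Python sketch below does this; the function name and the data are hypothetical, and `mu` stands for the known population mean (1 in the Exponential(1) case).

```python
def conditional_mse_of_mean(x, mu):
    """Closed-form conditional MSE of the bootstrap sample mean about a known mean mu."""
    n = len(x)
    xbar = sum(x) / n
    s2 = sum((v - xbar) ** 2 for v in x) / (n - 1)
    variance_term = (n - 1) / n ** 2 * s2   # conditional variance of the bootstrap mean
    bias_term = (xbar - mu) ** 2            # squared bias, using E*(Xbar*) = xbar
    return variance_term + bias_term

x = [0.3, 1.7, 0.9, 2.4, 0.2, 0.8]  # illustrative values, not thesis data
print(conditional_mse_of_mean(x, 1.0))
```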

In the same manner, the MSE for the variance and coefficient of variation could be estimated. These estimates were obtained using the algorithm described in the preceding section. The sample sizes for this experiment were n = 10, 20, 25, 40, 50, 70, 100, 140. Each estimator was bootstrapped using B = 5, 8, 10, 15, 20, 25, 40, 60, 100, 140, and 500. Figures 3.1, 3.2 and 3.3 show how the MSE*h for the mean, variance and coefficient of variation, respectively, decreases as both n and B increase.

A  remarkable  feature  of  these  plots  is  that  the  MSE*h  of  the  bootstrap  sample 
variance  (Figure  3.2)  decreases  much  faster  as  the  sample  size  increases  than  when  B 
increases.  Observe  the  big  jump  in  the  MSE*h  when  n  goes  from  10  to  40  relative  to 
that  of  B  going  from  5  to,  say,  40:  the  jump  is  much  greater  in  the  former. 

Figure 3.3 MSE*h of Bootstrap Coeff. of Variation: Exp(1).

Another observation of interest is that the MSE*h of the estimates decreases as B increases, but beyond a certain threshold very slowly. Indeed, the decrease in MSE*h beyond roughly B = 50 is barely noticeable. For example (see Figure 3.2), the MSE*h of the
sample  variance  decreases  only  by  one-thousandth  of  a  unit  when  B  is  increased  from 
200  to  500  replications.  This  is  also  true  for  the  sample  mean.  However,  for  the 
coefficient  of  variation  (see  Figure  3.3),  the  MSE*h  improved  about  two  percent  (.02) 
in  the  same  range  for  a  small  sample  size  (n=  10).  These  results  give  an  idea  of  the 
performance  of  the  MSE  of  the  bootstrap  estimates  of  a  given  estimator.  It  should  also 
suggest  to  the  statistician  that  once  the  estimators  are  performing  fairly  well  (i.e.,  once 
this  threshold  has  been  attained),  there  is  no  reason  to  increase  the  amount  of 
bootstrap  replications,  since  this  will  not  induce  a  great  improvement  in  the  estimates. 
An  important  point  here  is  that  when  an  attempt  is  made  to  estimate  the  sample 
variance  using  the  bootstrap  method,  the  number  of  bootstrap  replications  should  be 
greater  than  100  in  order  to  decrease  the  MSE*h  below  0.6. 

The bootstrap distributions of some of the estimators are shown in Figures 3.4 and 3.5 in the form of boxplots and a summary of the distributional statistics. These were obtained by using a statistical package, called SMTB10, developed at NPGS (see Appendix B). This package was modified by the author of this thesis in order to obtain MSE*h. Each boxplot represents the distribution of the bootstrap estimator based on the sample size n.

Notice, in Figure 3.4, that the distribution of the bootstrap sample mean resembles a Normal, as would be expected by the Central Limit Theorem, with the kurtosis and skewness oscillating around zero as n increases. Recall from the previous section that the standard deviation of $\bar{X}^*$, in the case of Figure 3.4, would be estimated by

$$\widehat{\mathrm{STD}}^*(\bar{X}^*) = \widehat{\mathrm{STD}}^* / \sqrt{n^*}, \quad n^* = N \times M / \mathrm{NE}(1)$$

and $\widehat{\mathrm{STD}}^*$ is the value shown in the bottom table of this figure. Figure 3.5 shows the distribution of the bootstrap sample variance. Looking at the distribution summary, one can say that this distribution is quite similar to that of a scaled Gamma($k$, $\beta$) distribution. Again, as n increases the kurtosis and skewness get closer to those of the Gamma, namely $6/k$ and $2/\sqrt{k}$ respectively. Figures C.4 and C.5, Appendix C, show the distribution of the same estimators when B = 150. It is easy to see that the distributional characteristics for the estimators follow the same patterns as those discussed above, where B = 5. The only difference there is that, as expected, the number of outliers decreases significantly, particularly in the case of the sample variance.

B.  THE  SAMPLE  VARIANCE 

This experiment was intended to further study the behavior of the bootstrap
sample variance for populations with various distributions. The ones discussed in this
section are the GAMMA(0.5,1), NORMAL(0,1) and LAPLACE(0,1). For this
experiment, the sample sizes were n = 5, 10, 20, 25, 30, 50, 60, and B = 5, 8, 10, 15,
20, 25, 30, 35, 40, 50, 100, and 500. In the first two cases, the GAMMA and
NORMAL distributions, the bootstrap sample variance seems to approximate the
population variance fairly well when n > 50, where the MSE*h is less than 0.10.
Figures 3.6, 3.7, and 3.8 show the relation between B, n, and the MSE*h of the
bootstrap sample variance for a Gamma(0.5,1), Normal(0,1), and Laplace(0,1)
respectively.

Notice that there is a lot of random variation in the MSE*h when B is in the
range 5 ≤ B < 50 for n ≤ 30, and for B < 25 when 30 < n ≤ 60. This random noise
extends beyond these ranges in the case of the Gamma(0.5,1). Notice that in Figure
3.6, the lines for the MSE*h of the sample variance when n = 15 and 20 are above
that for n = 10 when B < 300. However, when B = 500, these lines lie below the one
corresponding to n = 10. The MSE*h for n = 15 and 20 is actually less than the MSE*h
for n = 10 just after B > 150. In this experiment it is also true, as found for the
Exponential(1), that MSE*h decreases faster as n increases than when B increases. This
was also the result in the case of the Laplace(0,1). However (notice the scale of the
MSE in this case), the MSE*h is quite high. Figure 3.8 shows that for a sample of size
n ≤ 15, the MSE*h > 1.0 even when B is as large as 500. It was suspected that
this high MSE*h was caused by the mechanism used to generate Laplace
random variates. The first method used in this experiment takes the difference of two
Exponential(1) variates. The second method generates an Exponential(1) variate and converts
it to a Negative-Exponential(1) with probability .5. The histograms, using different
sample sizes, showed that the first algorithm used to generate Laplace random variates
was the more effective of the two. In any case, the point here is that for the ranges of n and B
used in the experiment, the MSE*h of the sample variance for a Laplace(0,1) never
decreased below 0.2. This was not the case for the other distributions. This suggests
that the performance of the bootstrap method depends on the distributional properties
of the population in question as well as the estimator under consideration.

Figure 3.6 MSE*h of Bootstrap Sample Variance of a G(0.5,1).
Figure 3.7 MSE*h of Bootstrap Sample Variance of a N(0,1).
Figure 3.8 MSE*h of Bootstrap Sample Variance of a L(0,1).
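The two Laplace generation mechanisms described above can be sketched as follows. This is a minimal illustration using Python's standard generator rather than the random number routines called by the thesis code; the function names are hypothetical. Both constructions give a Laplace(0,1) variate with variance 2.

```python
import random


def laplace_diff(rng):
    """First method: difference of two independent Exponential(1) variates."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def laplace_sign(rng):
    """Second method: an Exponential(1) variate negated with probability 0.5."""
    x = rng.expovariate(1.0)
    return -x if rng.random() < 0.5 else x


def sample_variance(xs):
    """Usual sample variance with divisor n - 1."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


if __name__ == "__main__":
    rng = random.Random(1986)
    for gen in (laplace_diff, laplace_sign):
        xs = [gen(rng) for _ in range(10000)]
        print(gen.__name__, sample_variance(xs))
```

Comparing the sample variances (or histograms) of the two streams is one simple way to check whether the generator, rather than the bootstrap, is responsible for an unexpectedly high MSE.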
C. THREE DIFFERENT ESTIMATORS FOR THE VARIANCE

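In Chapter Two, the expected value and the variance of the bootstrap sample
mean ($\bar{X}^*$) were derived. In this section, the expected value of the bootstrap sample
variance, call this ${}_1S^{*2}$, is calculated. Let

$${}_1S^{*2} = \frac{\sum_i (X_i^* - \bar{X}^*)^2}{n-1} = \frac{\sum_i X_i^{*2} - n\bar{X}^{*2}}{n-1}. \tag{3.4}$$

Note that

$$E^*(X_i^{*2}) = \frac{1}{n}\sum_i X_i^2 \tag{3.5}$$

so that

$$E^*\Big(\sum_i X_i^{*2}\Big) = \sum_i X_i^2. \tag{3.6}$$

Likewise the second moment of $\bar{X}^*$ is given by:

$$E^*(\bar{X}^{*2}) = \frac{1}{n^2}\Big[\sum_i E^*(X_i^{*2}) + \sum_i\sum_{j \ne i} E^*(X_i^* X_j^*)\Big]. \tag{3.7}$$

As before, $(X_i^*, X_j^*)$ has probability $1/n^2$ of being any point of the form $(x_i, x_j)$, so
from (2.7)

$$E^*(X_i^* X_j^*) = \frac{1}{n^2}\Big[\sum_i x_i^2 + \sum_i\sum_{j \ne i} x_i x_j\Big] = \frac{1}{n^2}\Big(\sum_i x_i\Big)^2 = \bar{X}^2. \tag{3.8}$$

Now, summing over the $n(n-1)$ pairs with $i \ne j$,

$$\sum_i\sum_{j \ne i} E^*(X_i^* X_j^*) = \frac{n(n-1)}{n^2}\Big(\sum_i x_i^2 + \sum_i\sum_{j \ne i} x_i x_j\Big) = \frac{n-1}{n}\Big(\sum_i x_i\Big)^2 = n(n-1)\bar{X}^2. \tag{3.9}$$

Then (3.7) can be expressed as

$$E^*(\bar{X}^{*2}) = \frac{1}{n^2}\Big[\sum_i X_i^2 + n(n-1)\bar{X}^2\Big]. \tag{3.10}$$

Finally, using (3.6) and (3.9), the conditional expected value of ${}_1S^{*2}$ is

$$\begin{aligned}
E^*({}_1S^{*2}) &= \frac{1}{n-1}\,E^*\Big(\sum_i X_i^{*2} - n\bar{X}^{*2}\Big) \\
&= \frac{1}{n-1}\Big[\sum_i E^*(X_i^{*2}) - nE^*(\bar{X}^{*2})\Big] \\
&= \frac{1}{n-1}\Big\{\sum_i X_i^2 - \frac{1}{n}\Big[\sum_i X_i^2 + n(n-1)\bar{X}^2\Big]\Big\} \\
&= \frac{1}{n-1}\Big[\frac{n-1}{n}\sum_i X_i^2 - (n-1)\bar{X}^2\Big] \\
&= \frac{\sum_i (X_i - \bar{X})^2}{n}.
\end{aligned} \tag{3.11}$$

Call this $\sigma_s^2$. Now suppose it is known that $X \sim N(\mu, \sigma^2)$ (this restriction is not really
required in this context) and it is desired to estimate the variance of X using the
bootstrap method. As shown in the previous chapter,

$$E(\bar{X}^*) = \mu_x, \tag{3.12}$$

so the unconditional expected value of ${}_1S^{*2}$ is:

$$E({}_1S^{*2}) = E\big[E^*({}_1S^{*2} \mid X)\big] = E\Big[\frac{\sum_i (X_i - \bar{X})^2}{n}\Big] = \frac{n-1}{n}\,\sigma_x^2. \tag{3.13}$$

Then ${}_1S^{*2}$ is a biased estimator for $\sigma_x^2$. The finite population correction factor might
thus be suggested to improve the performance of ${}_1S^{*2}$. Define

$${}_2S^{*2} = \frac{n}{n-1}\,{}_1S^{*2} = \frac{n}{(n-1)^2}\sum_i (X_i^* - \bar{X}^*)^2, \tag{3.14}$$

an unbiased bootstrap estimator of $\sigma_x^2$. Analyzing expressions (2.5) and (3.11), yet
another estimator for $\sigma_x^2$ can be suggested. Since the value of $E^*(X_i^*) = \bar{X}$ is known,
the following estimator for $\sigma_x^2$ also seems reasonable:

$${}_3S^{*2} = \frac{\sum_i (X_i^* - \bar{X})^2}{n}. \tag{3.15}$$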
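The three estimators (3.4), (3.14) and (3.15) can be written compactly, and the identity (3.11) can be checked exactly for a tiny sample by enumerating all n**n equally likely resamples. The sketch below is illustrative only; the function names (`s1`, `s2`, `s3`, `exact_bootstrap_expectation`) are hypothetical and are not the VARIA routines of the thesis code.

```python
import itertools


def s1(resample):
    """(3.4): centered at the resample mean, divisor n - 1."""
    n = len(resample)
    m = sum(resample) / n
    return sum((x - m) ** 2 for x in resample) / (n - 1)


def s2(resample):
    """(3.14): s1 rescaled by n / (n - 1)."""
    n = len(resample)
    return n / (n - 1) * s1(resample)


def s3(resample, original_mean):
    """(3.15): centered at the ORIGINAL sample mean, divisor n."""
    n = len(resample)
    return sum((x - original_mean) ** 2 for x in resample) / n


def exact_bootstrap_expectation(sample, stat):
    """Average `stat` over all n**n equally likely resamples (small n only)."""
    n = len(sample)
    values = [stat(list(r)) for r in itertools.product(sample, repeat=n)]
    return sum(values) / len(values)


if __name__ == "__main__":
    sample = [1.0, 2.0, 4.0, 7.0]
    xbar = sum(sample) / len(sample)
    target = sum((x - xbar) ** 2 for x in sample) / len(sample)
    print(exact_bootstrap_expectation(sample, s1), target)
```

Note that `s3` needs no resample mean, which is the source of the computational saving mentioned in the conclusions.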

The third experiment was conducted to compare the performance of these three
estimators (3.4), (3.14), and (3.15). Figures 3.9, 3.10, and 3.11 show the results of this
experiment.

As can be seen, the third estimator, ${}_3S^{*2}$, in almost all cases outperforms the
other two for all the different sample sizes tried in this experiment. Even the second
estimator (3.14) performs almost as well as ${}_1S^{*2}$ when n > 50. When n ≥ 50, the
difference between these three estimators is barely noticeable. However, for
very small samples, n < 20, ${}_3S^{*2}$ is definitely a better estimator for $\sigma^2$ than ${}_1S^{*2}$.
Efron [Ref. 1] has suggested the use of ${}_1S^{*2}$ as the bootstrap estimator of the sample
variance. As the plots suggest, the use of ${}_3S^{*2}$, and
even ${}_2S^{*2}$ (for larger samples, n > 50), can now be recommended over ${}_1S^{*2}$ to estimate the sample
variance. Note that as n → ∞, ${}_1S^{*2}$ is the same as ${}_2S^{*2}$. (Note: the two estimators
(3.14) and (3.15) are called VARIA2 and VARIA3 respectively in the FORTRAN
code, listed in Appendix B.)

Figure 3.11 MSE*h of the 3rd Variance Estimator of a N(0,1).

D. THE CENTER OF A DISTRIBUTION: COMPARISON OF THE MEAN,
MEDIAN AND TRIMMED MEAN

The sample mean is the most used estimator for the center of a distribution.
However, two other estimators are also used, especially for symmetric distributions: the
median and the 5% trimmed mean. There have been many comparisons of the
asymptotic performance of these three estimators. Lehman [Ref. 8] has calculated the
asymptotic variances of these estimators in the case when the sample is from a Normal(0,1) or
a Laplace(0,1) population. These calculations are summarized in Table 1 below.


TABLE 1

ASYMPTOTIC VARIANCE OF THE MEAN, MEDIAN
AND 5% TRIMMED MEAN

Probability                        ESTIMATOR
Distribution       Mean       Median      5% Trimmed Mean

Normal(0,1)        1.0/n      1.57/n      1.01/n
Laplace(0,1)       2.0/n      1.00/n      1.65/n

These values, among other things, show that for a sample coming from a
Normal(0,1), the mean has less asymptotic variance than the other estimators.
However, if the data come from a population with heavy tails, like the Laplace, the
median is a better estimator asymptotically (having less variance). The 5% trimmed
mean is a compromise between the other two: it should be used when the practitioner
does not know the nature of the tails of the population.
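The mean and median entries of Table 1 can be recovered from two standard large-sample results, sketched here as a check. The notation is an assumption of this note: $f$ is the population density, $m$ its median, and $\tilde{X}$ the sample median. The trimmed-mean entries are taken as given and are not rederived.

```latex
\begin{align*}
n\,\mathrm{Var}(\bar{X}) &= \sigma^2, &
n\,\mathrm{Var}(\tilde{X}) &\approx \frac{1}{4 f(m)^2}. \\[4pt]
\text{Normal}(0,1):\quad \sigma^2 &= 1, &
f(0) = \frac{1}{\sqrt{2\pi}} \;&\Rightarrow\; \frac{1}{4 f(0)^2} = \frac{\pi}{2} \approx 1.57. \\[4pt]
\text{Laplace}(0,1):\quad f(x) &= \tfrac{1}{2}e^{-|x|},\ \sigma^2 = 2, &
f(0) = \tfrac{1}{2} \;&\Rightarrow\; \frac{1}{4 f(0)^2} = 1.
\end{align*}
```

These agree with the 1.0/n, 1.57/n, 2.0/n and 1.00/n entries of the table.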

A fourth experiment was conducted to see if these observations hold when the
corresponding bootstrap estimators are used. In this experiment, the MSE of the
bootstrap estimators was compared with the asymptotic MSE of the usual estimators
as B increases. The asymptotic MSE (call it MSE^) of the three estimators could be
estimated by adding the asymptotic variance, as defined in Table 1, and the
bias squared. The MSE^ was compared with the MSE*h of the bootstrap
estimators, for several sample sizes, as B increases.

Figures 3.12, 3.13, and 3.14 summarize the results of this comparison for the case
of a Normal(0,1) population. Figures 3.15, 3.16, and 3.17 show the results for a
Laplace(0,1) population.

In these figures, the solid horizontal lines represent the values of the asymptotic
MSE of the usual estimators. For example, in Figure 3.12 the estimated asymptotic
MSE of the sample mean for a sample of size n = 5 is approximately 1/5.0 +
(BIAS)² ≈ .20. The dotted line represents the estimated MSE of the bootstrapped
estimators as B increases.

In summary, for the Normal(0,1) population, the bootstrapped sample mean and
the 5% trimmed mean have less error, asymptotically; they are estimating the center of
the distribution with much better precision than the bootstrap sample median.
Comparing Figures 3.12 and 3.13, it is clear that for sample sizes n < 60 the
bootstrapped sample mean shows a much smaller MSE than the bootstrapped sample
median. When the sample size is n = 60 there is no distinguishable difference between
the estimated MSEs of these two estimators. Notice that the bootstrapped 5% trimmed
mean (Figure 3.14) seems to perform as well as the bootstrapped sample mean; it is
better for very small samples, say for n = 5, 10, and 15. This confirms the general
relationship among these estimators, even when the estimators are bootstrapped,
that the 5% trimmed mean is a robust compromise between the sample mean and the
sample median.

Figure 3.16 Asymptotic MSE of the Sample Median of a L(0,1).
Figure 3.17 Asymptotic MSE of the Sample 5% Trimmed Mean of a L(0,1).

The results obtained in this experiment, however, do not agree with the classical
theory in the case of the Laplace population. In this case the bootstrapped sample mean
outperforms the bootstrapped sample median in estimating the center of the
distribution, for sample sizes n ≤ 20. For a sample of size n = 60, there is no real
difference between these two estimators in terms of MSE*h. Notice that the 5%
trimmed mean (Figure 3.17) performs better than the bootstrapped sample median
(Figure 3.16) for the cases where n < 60, but in turn is outperformed by the
bootstrapped sample mean (Figure 3.15).
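A comparison of this kind needs only a trimmed-mean function and a resampling loop. The sketch below is hedged: the trimming convention (drop floor(0.05 n) observations from each end) is an assumption and may differ from the TRIMM routine of the thesis code. Under this convention nothing is trimmed when n < 20, so the trimmed mean coincides with the mean for the smallest sample sizes. All names are hypothetical.

```python
import random
import statistics


def trimmed_mean(xs, prop=0.05):
    """Mean after dropping floor(prop * n) observations from each end."""
    xs = sorted(xs)
    k = int(prop * len(xs))
    kept = xs[k:len(xs) - k]
    return sum(kept) / len(kept)


def bootstrap_reps(sample, estimator, B, rng):
    """B bootstrap replicates of `estimator` computed from `sample`."""
    n = len(sample)
    return [estimator([rng.choice(sample) for _ in range(n)]) for _ in range(B)]


if __name__ == "__main__":
    rng = random.Random(7)
    sample = [rng.gauss(0.0, 1.0) for _ in range(40)]
    for est in (statistics.mean, statistics.median, trimmed_mean):
        reps = bootstrap_reps(sample, est, 50, rng)
        # Mean squared distance of the replicates from the true center 0.
        print(est.__name__, sum(r * r for r in reps) / len(reps))
```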

E. LINEAR REGRESSION BY BOOTSTRAPPING THE RESIDUALS

In a final experiment, linear regression estimation was considered. In this case,
there is a choice of bootstrapping methods; however, in this thesis only one method is
considered. The method considered here relies on bootstrapping residuals to estimate
the variance of the βh vector (βh stands for "β hat"). A measure to estimate the MSE
of this vector is also introduced.

In the typical linear regression problem there are n independent observations
(real-valued) $Y_i$ and it is assumed that the following model holds:

$$Y = X\beta + \varepsilon, \tag{3.16}$$

where ε is a random sample from some population F, and β is a p x 1 vector of
unknown parameters that must be estimated. All that is assumed about F is that it is
centered at zero, E(ε) = 0 and Cov(ε) = σ²I. One way of estimating β is by the
commonly used least squares method, in which the sum of the squared distances
(3) Using the same fitting technique used to obtain βh in the original problem,
calculate β*. Then obtain an estimate of β*:

$$b^* = (X'X)^{-1} X'Y^* \tag{3.21}$$

(4) Repeat steps (2) and (3) B times, obtaining independent bootstrap
realizations $b^*_1, \ldots, b^*_B$. Then the covariance of βh can be estimated by
the sample covariance matrix of the $b^*_b$, b = 1, 2, ..., B.

Efron has shown (see [Ref. 1: page 18]) that as B → ∞,

$$\mathrm{Var}(\beta^*) = \frac{n-p}{n}\,(X'X)^{-1}\hat{\sigma}^2 \tag{3.22}$$

where $\hat{\sigma}^2$ is an unbiased estimate of the variance of $Y_i$. In this procedure, σ² can be
estimated by ${}_2S^{*2}$. It can be seen that as B → ∞,

$$\mathrm{Var}(\beta^*) \to \mathrm{Var}(\hat{\beta}). \tag{3.23}$$

The following experiment was conducted to estimate the MSE of βh. Suppose it
is known that the observations $Y_i$ come from a Normal(0,1). Then the true value of the
β-vector in the regression model (3.17) is β = (0,0,0), so E(βh) = 0 and the
variance-covariance matrix of βh is $\Sigma_\beta = \sigma^2 (X'X)^{-1}$, where it is known that σ² = 1.

For this experiment, a design matrix X of orthogonal column vectors was
created. This matrix has 1's in the first column; then a series of n alternating 1's and
-1's in the second column; and finally the third column (for p = 3) is a series of two 1's
and two -1's (also, $n = 2^x$, x = 2, 3, 4, ...). Then it was possible to readily calculate
βh by

$$\hat{\beta} = (1/n)\,(X'Y). \tag{3.24}$$

The bootstrap algorithm described above was used to generate a sample of $\beta^*_i$. Then,
an estimate of $\beta^*_i$ is

$$b^*_i = (1/n)\,(X'Y^*). \tag{3.25}$$
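The residual bootstrap with this orthogonal design can be sketched in a few lines. The code is an illustration under stated assumptions, not the BLREG routine itself: the sign pattern of the columns follows the FORTRAN listing in Appendix B (second column -1, +1, -1, +1, ...; third column -1, -1, +1, +1, ...), n is assumed to be a multiple of 4, and all function names are hypothetical.

```python
import random


def design_matrix(n):
    """Rows (1, x2, x3) with mutually orthogonal columns; needs n % 4 == 0."""
    rows = []
    for i in range(n):
        x2 = -1.0 if i % 2 == 0 else 1.0
        x3 = -1.0 if i % 4 < 2 else 1.0
        rows.append((1.0, x2, x3))
    return rows


def fit(X, y):
    """Least squares when X'X = n I, as in (3.24): beta_hat = X'y / n."""
    n = len(y)
    return [sum(row[j] * yi for row, yi in zip(X, y)) / n for j in range(3)]


def residual_bootstrap(X, y, B, rng):
    """Return beta_hat and B bootstrap realizations b* as in (3.25)."""
    beta_hat = fit(X, y)
    fitted = [sum(b * x for b, x in zip(beta_hat, row)) for row in X]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    reps = []
    for _ in range(B):
        # Resample residuals with replacement and rebuild the responses.
        y_star = [fi + rng.choice(resid) for fi in fitted]
        reps.append(fit(X, y_star))
    return beta_hat, reps


if __name__ == "__main__":
    rng = random.Random(3)
    X = design_matrix(16)
    y = [rng.gauss(0.0, 1.0) for _ in X]
    beta_hat, reps = residual_bootstrap(X, y, 10, rng)
    print(beta_hat, reps[0])
```

Because the columns are orthogonal with squared length n, no matrix inversion is needed, which is what makes (3.24) and (3.25) so cheap to evaluate.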
It was desired to develop a measure of precision for β* analogous to MSE, which
depends on Var(β*) and the bias of β*. Define

$$\mathrm{MSE}(\beta^*) = E\big[(\beta^* - E(\hat{\beta}))^2\big]. \tag{3.26}$$

Recall that in this experiment E(βh) = 0. Then, (3.26) could be estimated in the
following way:

1) Do step (4), as above, obtaining

$$\mathrm{MSE}^*(\beta^*) = \frac{1}{B}\sum_{i=1}^{B}\big(\beta^*_i - E(\hat{\beta})\big)^2 = \frac{1}{B}\sum_{i=1}^{B}\lVert \beta^*_i - \beta \rVert^2. \tag{3.27}$$

2) Repeat (1) M times to obtain an average MSE*h of the procedure
(3.27).
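The measure (3.27) and its average over M repetitions amount to the two small functions below. This is a sketch with hypothetical names. It follows the formula as written; if the listing in Appendix B is read correctly, the BLREG routine instead averages the b* vectors first and then takes the squared norm of that average, which is a different quantity.

```python
def mse_star(b_stars, beta):
    """(3.27): average squared Euclidean distance of the b*_i from beta."""
    B = len(b_stars)
    total = 0.0
    for b_star in b_stars:
        total += sum((bi - b) ** 2 for bi, b in zip(b_star, beta))
    return total / B


def average_mse_star(mse_values):
    """Step 2: average the M values obtained by repeating step 1."""
    return sum(mse_values) / len(mse_values)


if __name__ == "__main__":
    reps = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
    print(mse_star(reps, [0.0, 0.0, 0.0]))
```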

The results of this experiment are shown in Figure 3.18.

Here, the sample sizes were taken as n = 4, 8, 16, 32, 64, and 128, and M = 15.
The estimator βh was bootstrapped B = 5, 10, 15, 20, 30, 40, 50, 100, 150,
and 500 times. The results obtained were surprising. When the number of observations is
small, n < 33, the MSE*h of the estimator is relatively high (MSE*h > .09) even
when B is as large as 500. When n > 65, there is some improvement in the MSE*h; in
this case, the MSE*h is at least 5% lower than when n < 33. It is interesting to see
that increasing B from 5 to 500 yields no remarkable gain in the precision of the
estimator when n > 65; the MSE*h oscillates around the same value. When n <
33, increasing B by the same amount decreases the MSE*h, but by less than 1% of its
initial value. It seems that in linear regression estimation the key problem is the
size of n and not of B.

When using this method for estimating the MSE of βh, the practitioner must
bear in mind that it involves the residual distribution and hence assumes that the linear
model is correct.
IV. CONCLUSIONS


As has been shown, the Bootstrap is an accurate method for estimating the
precision of the estimates and for estimating the distribution (or some feature of the
distribution) of an estimator. For MSE, the number B required to obtain a certain
degree of accuracy will vary depending mainly on the population (this is a subject for
further study) and the type of estimator used. It was found that
when the sample comes from a population having heavy, long tails, such as the Laplace
distribution, the bootstrap estimator for the mean is a better estimator of
the center of the distribution than the median or the 5% trimmed mean, whereas in the
case of nonbootstrap estimators, the median is a better estimator than the other
two.

In estimating the variance of a population, it was found that there exists an
estimator that is more accurate than the typical estimator recommended in the
bootstrap literature. This estimator (${}_3S^{*2}$) relies on the fact that the original sample
mean in the bootstrap method is known. Once this value is calculated, there is no need
to find $\bar{X}^*$ for each bootstrap sample, since $\bar{X}$ is fixed throughout the process. Another
estimator for σ² was also proposed, ${}_2S^{*2}$. This estimator is unbiased, whereas ${}_1S^{*2}$ is
not, but for small sample sizes, n < 30, it is not as accurate as ${}_3S^{*2}$. It should be
emphasized that in using this estimator, ${}_3S^{*2}$, one can reduce the computer time
required to estimate σ². Hence, this is another advantage of using this estimator.

In linear regression estimation, using definition (3.28) as a measure of precision,
it was found that the bootstrap method analyzed in this thesis gives estimates
with small MSE*h for relatively small values of B, but only for relatively large sample sizes, n
> 60. When the sample size is small, increasing B up to 500 will result in a gain of
around 1% in the precision of the estimates. Thus, in linear regression estimation
the critical issue for MSE is the sample size. It was also noted that the disadvantage of
this method is that it assumes that the model in question is correct.

The result that seems to apply to all cases studied in this work is that, in using
the bootstrap method for estimating the MSE of some parameter θ, there really exists a
tradeoff between B and n: as n increases, one can significantly decrease B and still get
very precise estimates. However, no matter what n is, once some degree of accuracy
has been obtained, there is no reason to increase B much more, since this will not
induce greater precision in the estimates. In Appendix C, the reader will find tables
that provide information about this tradeoff for given estimators and populations.
Analyzing the figures presented in previous chapters and these tables, a rule of thumb
about the relation between n and B can be hypothesized. The following rule seems
reasonable: make the number B ≈ 1000/n. In almost all cases studied here, this rule
yielded estimates with MSE*h < 0.05 (note: independent of n, making 40 < B <
60 will also produce estimates with small MSE*h). The only exception is when the
population in question was Laplace(0,1). This is an area that needs further study.
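The rule of thumb can be turned into a one-line helper. The sketch below is illustrative only; the function name and the lower bound `floor_B` (set to 5, the smallest number of replications used in the experiments) are assumptions of this note and are not part of the stated rule.

```python
import math


def suggested_B(n, floor_B=5):
    """Rule of thumb B ~ 1000 / n, never below the assumed minimum floor_B."""
    return max(floor_B, math.ceil(1000 / n))


if __name__ == "__main__":
    for n in (5, 10, 20, 50, 60):
        print(n, suggested_B(n))
```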

Finally,  it  was  found  that  a  (possibly  not  serious)  disadvantage  in  using  the 
bootstrap  method  is  the  computer  time  required  to  obtain  the  estimates.  For  example, 
in  estimating  the  variance  of  a  Gamma(0.5,l)  distribution,  increasing  B  from  20  to  100 
increased  the  CPU  time  of  the  IBM  3033-A16  system  used  in  this  experiment  about 
75%.  This  time  is  increased  at  least  another  50%  if  one  desires  to  obtain  the 
distributional  characteristics  of  the  estimator  (i.e.,  boxplots).  However,  in  view  of  the 
decreasing  cost  of  computer  time,  this  does  not  seem  to  be  a  major  obstacle  for  using 
this  method. 


APPENDIX  A 

LIST  OF  SPECIAL  NOTATIONS 


(1) θh      : θ-hat, estimator of θ

(2) Fh      : empirical probability distribution

(3) θ*(F*)  : the value of θ based on bootstrap method

(4) X*      : a bootstrap random sample

(5) MSE*h   : estimated MSE based on bootstrap method

(6) βh      : estimator of the p x 1 β-vector

(7) bh      : an estimate of βh

(8) β*      : estimator of β based on bootstrap method

(9) b*      : an estimate of β*

APPENDIX  B 

FORTRAN CODE FOR BOOTSTRAPPING


This program, called BOOTST, was developed to estimate distributional
properties of some statistical estimators using the Bootstrap Method. It is also possible
to obtain estimates of the MSE of the estimators. The code was written in FORTRAN
77. It can generate a random sample for Monte Carlo simulation or can read the
sample data by a CALL to a subroutine FDATA (at the end of the code listed below).
The user can generate samples from the following distributions: Exponential(λ),
Laplace(0,1), Uniform(0,1), Normal(0,1), Gamma(α,1), Poisson(λ), and the
Geometric(p). The parameters α, λ, and p can be specified by the user within the
appropriate function. With this program, the user can study the distributional
properties of the following bootstrap estimators: mean, variance, coefficient of
variation, serial correlation, median, and the 5%-trimmed mean. Also, one can obtain
estimates of the "β-vector" in the case of linear regression estimation by
bootstrapping the residuals (see Chapter Three, Section E). The program is structured
in five main sections: the MAIN program, which includes the input requirements; the DATA
GENERATION section; the ESTIMATORS definition; the BOOTSTRAP SAMPLING
mechanism; and the STATISTICS section.

The program can be used in two ways. The first makes use of another program
called SMTB10. This code was developed at the NPGS by Prof. P.A.W. Lewis and
Mr. Luis Uribe (see [Ref. 9]). It is highly recommended that the user become familiar
with the documentation of SMTB10 before attempting to use BOOTST. In general,
when using this option, the user must create an input file containing the parameters
specified in the input section of BOOTST. Then, a CALL is made to SMTB10, and in
turn SMTB10 will make various sequential calls to generate the data, calculate the
values of the desired estimators (using the bootstrap mechanism), and produce the
statistics. When a call to SMTB10 is made, the user can produce estimates for 1, 2,
or 3 different estimators using 1, 2, or 3 sample data generators or any of the eight
possible combinations. Also, the user can select up to 8 different sample sizes for
each estimator. Therefore, in one execution, statistics for up to three different
estimators, using up to three different data generators, and for up to eight different
sample sizes can be obtained using the bootstrap method. These options are controlled
in the INPUT requirements of BOOTST. At the end of each execution, BOOTST will
send to a printer (or to the screen, depending on the option selected) a file containing
boxplots and a summary of the statistics for each estimator. The input requirements
are controlled by the user in a file called BOSIN.

The general execution of BOOTST runs as follows:

(1) For each estimator
(2)    Read Input Requirements (MAIN)
(3)    CALL SMTB10
(4)       CALL Data Generator (Data Generation Section)
(5)          N = k * n random variates are generated, where k = 1 or 2, ...,
             or 8 different sample sizes. Then the data is sectioned into
             samples of sizes N(K) = n. If M repetitions of the process are
             allowed, then a total of M * N random numbers are obtained.
             Estimates are calculated for each sample size N(K).
(6)       CALL Estimator Function (Estimator Section)
             Begin Generation of Estimates
(7)          For I = 1 to B
                CALL BOOTSTRAP (Bootstrap Section)
                CALL STATISTIC
                Store Bootstrap Estimates
             CALL STATISTIC
             Store Mean of Bootstrap Estimates
(8) PRODUCE Boxplot and Statistics
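The control flow outlined above can be mirrored in a short driver. This is only a sketch of the loop structure under simplifying assumptions (one data generator, no boxplots, no SMTB10); it is not a translation of BOOTST, and the function and argument names are hypothetical.

```python
import random
import statistics


def run(estimators, generator, sizes, B, M, seed=0):
    """For each estimator and sample size, store M means of B bootstrap estimates."""
    rng = random.Random(seed)
    results = {}
    for name, est in estimators.items():          # step (1)
        for n in sizes:                           # one section per sample size
            means = []
            for _ in range(M):                    # M repetitions of the process
                sample = [generator(rng) for _ in range(n)]           # steps (4)-(5)
                reps = [est([rng.choice(sample) for _ in range(n)])   # step (7)
                        for _ in range(B)]
                means.append(sum(reps) / B)       # mean of bootstrap estimates
            results[(name, n)] = means
    return results


if __name__ == "__main__":
    out = run({"mean": statistics.mean, "median": statistics.median},
              lambda rng: rng.gauss(0.0, 1.0), sizes=[5, 10], B=20, M=15)
    for key, vals in out.items():
        print(key, statistics.mean(vals))
```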

The input requirements specific to BOOTST are explained below; the other
inputs declared in the MAIN are specific to SMTB10 (see [Ref. 10]).

(1)  ANS  :  1  or  0  :  If  the  user  wants  to  store  each  bootstrap  estimate  for  each 
estimator,  the  answer  should  be  1.  Estimates  are  stored  in  FILE  21. 

(2)  NE(I):  a  vector  containing  the  sample  sizes  (n).  Up  to  8  different  sample  sizes. 

(3)  IB:  Number  of  bootstrap  replications  for  each  execution. 

(4)  IX:  Seeds  used  to  generate  data  (up  to  3  different  seeds). 

If the user desires to obtain estimates and graphical displays of two or more
different estimators and is using a large number B, say B ≥ 60, the amount of
computer time required will increase significantly depending on the system used.

The second way to execute BOOTST is recommended for more experienced users
or for those who do not want to obtain boxplots of the estimates. This option will save
a great deal of CPU time. For this option, the user will have to make some simple
changes to the MAIN program:

(1) Delete from the input requirement section those inputs that only apply to
SMTB10 (those not listed above).

(2) Replace the call to SMTB10 by the following sequence of calls:

(i) Call Data Generator (i.e., one of the data generators)

(ii) Call Estimator (i.e., one of the estimator functions). The estimator
function (subroutine) will make the appropriate call to the Bootstrap and
Statistic subroutines.

(3) For this option, the input parameter ANS must be set to integer 1. Also, if
the user now refers to the code, it will be noticed that each estimator
subroutine has a special parameter WI. This parameter must be deleted
everywhere since it only applies to SMTB10.

The  computer  code  is  listed  below. 

C     UPDATED 07-03-86 W. CORTES-COLON
C     MAIN : DECLARATION, INPUT SECTION AND CALL FOR SMTB10.
      COMMON IB,IX1,IX2,IX3,IX4,ANS
      COMMON Z(20006)
      CHARACTER*80 T1, T2, T3
      REAL*4 Y(10000),YMIN,YMAX,PMEAN(3),AMSEC(3)
      INTEGER NE(8),D,RG,SEI,SVS,N,M,L,NEST,NSR
      INTEGER IX1,IX2,IX3,IX4,IB,ANS
      EXTERNAL XMEAN,VARIA,COEVA,SECOR,MEDIA,TRIMM,VARI2,VARI3,BLREG
      EXTERNAL EXPON,UNIFO,NORML,GAMAF,POISF,GEOMF,LAPLA
C
      OPEN(UNIT=19,FILE='BOSIN')
      READ(19,*) ANS
   10 READ(19,*,END=999) N,M,L,D,RG,SEI,SVS,NEST,NSR
      READ(19,*) YMIN,YMAX
      READ(19,*) (NE(I),I=1,L)
      READ(19,*) IB
      WRITE(22,105) IB,(NE(I),I=1,L)
  105 FORMAT(I4,8I4)
      READ(19,*) IX1,IX2,IX3,IX4
      READ(19,115) T1
  115 FORMAT(A80)
      READ(19,115) T2
      READ(19,115) T3
      READ(19,*) (PMEAN(I),I=1,3)
      READ(19,*) (AMSEC(J),J=1,3)
C     CALL FOR SMTB10 : PRODUCES BOX-PLOT AND COMPARISON OF STATISTICS
      CALL SMTB10(IX1,IX2,IX3,Y,N,M,NE,L,D,NSR,RG,SEI,SVS,YMIN,YMAX,
     *            NEST,NORML,XMEAN,T1,NORML,MEDIA,T2,NORML,TRIMM,T3,
     *            PMEAN,AMSEC)
      GO TO 10
  999 WRITE(6,*) 'END OF DATA INPUT'
      STOP
      END

C     DATA GENERATION SECTION
      SUBROUTINE EXPON(IX,X,NEK)
      REAL X(1)
      IF(NEK .LE. 0) RETURN
      CALL SEXPN(IX,X,NEK,1,0)
      RETURN
      END
C
C     LAPLACE(0,1) AS THE DIFFERENCE OF TWO EXPONENTIAL(1) VARIATES
      SUBROUTINE LAPLA(IX,X,NEK)
      INTEGER ISEED
      REAL X(1),XU(1000),X2(1000)
      IF(NEK .LE. 0) RETURN
      CALL SEXPN(IX,X2,NEK,1,0)
      CALL SEXPN(IX,XU,NEK,1,0)
      DO 10 I=1,NEK
         X(I)=X2(I)-XU(I)
   10 CONTINUE
      RETURN
      END
C
      SUBROUTINE UNIFO(IX,X,NEK)
      REAL X(1)
      IF(NEK .LE. 0) RETURN
      CALL SRND(IX,X,NEK,1,0)
      RETURN
      END
C
      SUBROUTINE NORML(IX,X,NEK)
      REAL X(1)
      IF(NEK .LE. 0) RETURN
      CALL SNOR(IX,X,NEK,1,0)
      RETURN
      END
C
      SUBROUTINE GAMAF(IX,X,NEK)
      REAL X(1),ALPHA
      ALPHA=0.5
      IF(NEK .LE. 0) RETURN
      CALL SGAMA(IX,X,NEK,1,0,ALPHA)
      RETURN
      END
C
      SUBROUTINE POISF(IX,X,NEK)
      REAL X(1),LAMDA
      LAMDA=0.5
      IF(NEK .LE. 0) RETURN
      CALL SPOIS(IX,X,NEK,1,0,LAMDA)
      RETURN
      END
C
      SUBROUTINE GEOMF(IX,X,NEK)
      REAL X(1),P
      P=0.5
      IF(NEK .LE. 0) RETURN
      CALL SGEOM(IX,X,NEK,1,0,P)
      RETURN
      END

C     ESTIMATOR SECTION : BLREG IS USED FOR LINEAR REGRESSION ESTIMATION
C     ONLY. IT IS RECOMMENDED TO USE THIS ESTIMATOR SEPARATELY: I.E.,
C     WHEN CALLING SMTB10, USE ONLY ONE ESTIMATOR.
      REAL FUNCTION BLREG(YOBS,NEK,WI)
      COMMON IB,ANS
      REAL YOBS(1),BMSTAR(3),MSEBS
      REAL XDES1(600,3),XTRANS(3,600),XDES2(3,600),XTXINV(3,3)
      REAL RES1(600),YHAT(600),RSTAR(600),BHAT(3),YSTAR(600)
      REAL BSTAR(3)
      INTEGER WI
C     BUILD THE ORTHOGONAL DESIGN MATRIX XDES1 (NEK X 3)
      DO 10 I=1,NEK
         YHAT(I)=0.0
         DO 10 J=1,3
            XDES1(I,J)=1.0
            XDES2(J,I)=0.0
            XTRANS(J,I)=0.0
   10 CONTINUE
      DO 20 I=1,NEK,2
         XDES1(I,2)=-1.0
   20 CONTINUE
      DO 30 I=1,NEK,4
         XDES1(I,3)=-1.0
         XDES1(I+1,3)=-1.0
   30 CONTINUE
C     (X'X)**(-1) = (1/NEK)*I ; OFF-DIAGONAL TERMS ARE SET TO ZERO
      DO 40 I=1,3
         DO 35 J=1,3
            XTXINV(I,J)=0.0
   35    CONTINUE
         XTXINV(I,I)=1.0/FLOAT(NEK)
         BHAT(I)=0.0
   40 CONTINUE
      DO 50 J=1,NEK
         DO 50 I=1,3
            XTRANS(I,J)=XDES1(J,I)
   50 CONTINUE
C     XDES2 = (X'X)**(-1) X'
      DO 60 K=1,3
         DO 60 J=1,NEK
            DO 60 I=1,3
               XDES2(K,J)=XDES2(K,J)+XTXINV(K,I)*XTRANS(I,J)
   60 CONTINUE
C     BHAT = XDES2 * YOBS ; THEN FITTED VALUES AND RESIDUALS
      DO 70 K=1,3
         DO 70 J=1,NEK
            BHAT(K)=BHAT(K)+XDES2(K,J)*YOBS(J)
   70 CONTINUE
      DO 90 J=1,NEK
         DO 80 I=1,3
            YHAT(J)=YHAT(J)+XDES1(J,I)*BHAT(I)
   80    CONTINUE
         RES1(J)=YOBS(J)-YHAT(J)
   90 CONTINUE
      DO 95 IWX=1,3
         BMSTAR(IWX)=0.0
   95 CONTINUE
      MSEBS=0.0
C     BOOTSTRAP LOOP : RESAMPLE THE RESIDUALS IB TIMES
      DO 100 IW=1,IB
         DO 110 JI=1,NEK
            RSTAR(JI)=RES1(JI)
  110    CONTINUE
         CALL BOOTS(RSTAR,NEK)
         DO 120 K=1,NEK
            YSTAR(K)=YHAT(K)+RSTAR(K)
  120    CONTINUE
C        BSTAR = BHAT + XDES2 * RSTAR, ACCUMULATED OVER KI
         DO 130 K=1,3
            BSTAR(K)=BHAT(K)
            DO 130 KI=1,NEK
               BSTAR(K)=BSTAR(K)+XDES2(K,KI)*RSTAR(KI)
  130    CONTINUE
         WRITE(6,5) (BSTAR(KL),KL=1,3)
    5    FORMAT(3F8.4)
         DO 140 KJ=1,3
            BMSTAR(KJ)=BMSTAR(KJ)+BSTAR(KJ)
  140    CONTINUE
  100 CONTINUE
      DO 150 KH=1,3
         BMSTAR(KH)=BMSTAR(KH)/FLOAT(IB)
  150 CONTINUE
      DO 160 KI=1,3
         MSEBS=MSEBS+BMSTAR(KI)*BMSTAR(KI)
  160 CONTINUE
      BLREG=MSEBS
      IF(ANS.EQ.1 .AND. WI.EQ.1) WRITE(21,102) BLREG
  102 FORMAT(F8.4)
      RETURN
      END

      REAL FUNCTION XMEAN(X,NEK,WI)
      COMMON IB,ANS
      REAL X(1),Y(1000),V(10),BB(1000)
      INTEGER WI
      DO 10 I=1,NEK
         Y(I)=X(I)
   10 CONTINUE
      DO 15 I=1,IB
         DO 20 JI=1,NEK
            X(JI)=Y(JI)
   20    CONTINUE
         CALL BOOTS(X,NEK)
         CALL BSTATS(X,NEK,V)
         BB(I)=V(1)
   15 CONTINUE
      CALL BSTATS(BB,IB,V)
      XMEAN=V(1)
      IF (ANS.EQ.1 .AND. WI.EQ.1) WRITE(21,102) XMEAN
  102 FORMAT(F8.4)
      RETURN
      END


      REAL FUNCTION VARIA(X,NEK,WI)
      COMMON IB,ANS
      REAL X(1),Y(1000),V(10),BB(1000)
      INTEGER WI
      DO 10 I=1,NEK
         Y(I)=X(I)
   10 CONTINUE
      DO 15 I=1,IB
         DO 20 JI=1,NEK
            X(JI)=Y(JI)
   20    CONTINUE
         CALL BOOTS(X,NEK)
         CALL BSTATS(X,NEK,V)
         BB(I)=V(2)
   15 CONTINUE
      CALL BSTATS(BB,IB,V)
      VARIA=V(1)
      IF (ANS.EQ.1 .AND. WI.EQ.1) WRITE(21,102) VARIA
  102 FORMAT(F8.4)
      RETURN
      END

      REAL FUNCTION VARI2(X,NEK,WI)
      COMMON IB,ANS
      REAL X(1),Y(1000),V(10),BB(1000)
      INTEGER WI
      DO 10 I=1,NEK
         Y(I)=X(I)
   10 CONTINUE
      DO 15 I=1,IB
         DO 20 JI=1,NEK
            X(JI)=Y(JI)
   20    CONTINUE
         CALL BOOTS(X,NEK)
         CALL BSTATS(X,NEK,V)
         BB(I)=V(3)
   15 CONTINUE
      CALL BSTATS(BB,IB,V)
      VARI2=V(1)
      IF (ANS.EQ.1 .AND. WI.EQ.1) WRITE(21,102) VARI2
  102 FORMAT(F8.4)
      RETURN
      END

      REAL FUNCTION VARI3(X,NEK,WI)
      COMMON IB,ANS
      REAL X(1),Y(1000),V(10),BB(1000),SMEAN,DNEK
      INTEGER WI
      DNEK=NEK
      SMEAN=0.0
      DO 10 I=1,NEK
         Y(I)=X(I)
         SMEAN=SMEAN+X(I)
   10 CONTINUE
      SMEAN=SMEAN/DNEK
      DO 15 I=1,IB
         DO 20 JI=1,NEK
            X(JI)=Y(JI)
   20    CONTINUE
         CALL BOOTS(X,NEK)
C  RESET THE ACCUMULATOR BEFORE SUMMING SQUARED DEVIATIONS
         BB(I)=0.0
         DO 30 JJ=1,NEK
            BB(I)=BB(I)+((X(JJ)-SMEAN)**2)
   30    CONTINUE
         BB(I)=BB(I)/DNEK
   15 CONTINUE
      CALL BSTATS(BB,IB,V)
      VARI3=V(1)
      IF (ANS.EQ.1 .AND. WI.EQ.1) WRITE(21,102) VARI3
  102 FORMAT(F8.4)
      RETURN
      END

      REAL FUNCTION COEVA(X,NEK,WI)
      COMMON IB,ANS
      REAL X(1),Y(1000),V(10),BB(1000)
      INTEGER WI
      DO 10 I=1,NEK
         Y(I)=X(I)
   10 CONTINUE
      DO 15 I=1,IB
         DO 20 JI=1,NEK
            X(JI)=Y(JI)
   20    CONTINUE
         CALL BOOTS(X,NEK)
         CALL BSTATS(X,NEK,V)
         BB(I)=V(4)
   15 CONTINUE
      CALL BSTATS(BB,IB,V)
      COEVA=V(1)
      IF (ANS.EQ.1 .AND. WI.EQ.1) WRITE(21,102) COEVA
  102 FORMAT(F8.4)
      RETURN
      END

      REAL FUNCTION SECOR(X,NEK,WI)
      COMMON IB,ANS
      REAL X(1),Y(1000),V(10),BB(1000)
      INTEGER WI
      DO 10 I=1,NEK
         Y(I)=X(I)
   10 CONTINUE
      DO 15 I=1,IB
         DO 20 JI=1,NEK
            X(JI)=Y(JI)
   20    CONTINUE
         CALL BOOTS(X,NEK)
         CALL BSTATS(X,NEK,V)
         BB(I)=V(5)
   15 CONTINUE
      CALL BSTATS(BB,IB,V)
      SECOR=V(1)
      IF (ANS.EQ.1 .AND. WI.EQ.1) WRITE(21,102) SECOR
  102 FORMAT(F8.4)
      RETURN
      END

      REAL FUNCTION MEDIA(X,NEK,WI)
      COMMON IB,ANS
      REAL X(1),Y(1000),V(10),BB(1000)
      INTEGER WI
      DO 10 I=1,NEK
         Y(I)=X(I)
   10 CONTINUE
      DO 15 I=1,IB
         DO 20 JI=1,NEK
            X(JI)=Y(JI)
   20    CONTINUE
         CALL BOOTS(X,NEK)
         CALL BSTATS(X,NEK,V)
         BB(I)=V(6)
   15 CONTINUE
      CALL BSTATS(BB,IB,V)
      MEDIA=V(1)
      IF (ANS.EQ.1 .AND. WI.EQ.1) WRITE(21,102) MEDIA
  102 FORMAT(F8.4)
      RETURN
      END

      REAL FUNCTION TRIMM(X,NEK,WI)
      COMMON IB,ANS
      REAL X(1),Y(1000),V(10),BB(1000)
      INTEGER WI
      DO 10 I=1,NEK
         Y(I)=X(I)
   10 CONTINUE
      DO 15 I=1,IB
         DO 20 JI=1,NEK
            X(JI)=Y(JI)
   20    CONTINUE
         CALL BOOTS(X,NEK)
         CALL BSTATS(X,NEK,V)
         BB(I)=V(7)
   15 CONTINUE
      CALL BSTATS(BB,IB,V)
      TRIMM=V(1)
      IF (ANS.EQ.1 .AND. WI.EQ.1) WRITE(21,102) TRIMM
  102 FORMAT(F8.4)
      RETURN
      END


C  BOOTSTRAP SECTION
      SUBROUTINE BOOTS(X,NEK)
      COMMON IX4
      REAL X(1),XB(1000),XX(1000)
      CALL SRND(IX4,XB,NEK,2,0)
      DO 10 I=1,NEK
         B=XB(I)*NEK
         M=INT(B+1)
         IF (M.GT.NEK) M=NEK
         XX(I)=X(M)
   10 CONTINUE
      DO 20 I=1,NEK
         X(I)=XX(I)
   20 CONTINUE
      RETURN
      END


C  STATISTICS SECTION
      SUBROUTINE BSTATS(X,NEK,V)
      COMMON IB
      REAL X(1),V(10),ZW(5000),ZT(5000),R,BMDIAN
      REAL*8 XTRIM,BTRIM,VARIA2,DEV,DVAR,VSTD,DNB,SCOR

§|§!  VARIANCE> cv 
lnn  ^jglioS!  Sg T0 10 

100  RETURN* ,,SUBSAMPLE  SEIZE  IS  TOO  SMALL'  ,F6. 2  ) 

   10 CONTINUE
      XMEAN=0.0
      DNB=NB
      DO 20 I=1,NB
         XMEAN=XMEAN+X(I)
   20 CONTINUE
      XMEAN=XMEAN/DNB
      V(1)=XMEAN
C  TO GENERATE HIGHER MOMENTS
      SUM2=0.0D0
      SUM3=0.0D0
      SUM4=0.0D0
      DO 30 I=1,NB
         DEV=X(I)-XMEAN
         SUM2=SUM2+DEV**2
         SUM3=SUM3+DEV**3
         SUM4=SUM4+DEV**4
   30 CONTINUE

C  VARIANCE AND STANDARD DEVIATION
      DVAR=SUM2/(DNB-1.0D0)
      V(2)=DVAR

      VSTD=DSQRT(DVAR)

APPENDIX C

MSE*h  OF  SOME  ESTIMATORS  USING  THE  BOOTSTRAP  METHOD 


EST. MSE Of The Sample Mean Of An EXP(1)

  B/n      10      20      25      40      50      70     100     140
    5  0.1213  0.0544  0.0531  0.0309  0.0257  0.0216  0.0142  0.0118
    8  0.1157  0.0570  0.0446  0.0299  0.0277  0.0164  0.0123  0.0103
   10  0.1131  0.0551  0.0453  0.0288  0.0247  0.0170  0.0134  0.0097
   15  0.1095  0.0543  0.0451  0.0277  0.0241  0.0164  0.0113  0.0099
   20  0.1064  0.0528  0.0432  0.0262  0.0252  0.0163  0.0131  0.0096
   25  0.1051  0.0525  0.0405  0.0270  0.0244  0.0153  0.0132  0.0097
   40  0.1022  0.0508  0.0417  0.0277  0.0245  0.0162  0.0122  0.0087
   60  0.1031  0.0511  0.0410  0.0258  0.0239  0.0159  0.0117  0.0091
  100  0.1030  0.0512  0.0420  0.0252  0.0244  0.0155  0.0119  0.0090
  140  0.1018  0.0511  0.0406  0.0256  0.0242  0.0156  0.0117  0.0092
  500  0.1007  0.0471  0.0368  0.0217  0.0202  0.0119  0.0101  0.0041

EST. MSE Of The Sample Variance Of An EXP(1)

  B/n      10      20      25      40      50      70     100     140
    5  0.9130  0.5313  0.4114  0.1690  0.1703  0.1120  0.0745  0.1363
    8  0.7783  0.4765  0.4023  0.1951  0.1538  0.1176  0.0847  0.0791
   10  0.7776  0.5418  0.4485  0.1703  0.1461  0.1393  0.0680  0.0800
   15  0.6732  0.5385  0.3457  0.1533  0.1433  0.1096  0.0650  0.0817
   20  0.6408  0.4589  0.3447  0.1562  0.1373  0.1043  0.0662  0.0852
   25  0.7115  0.4840  0.3452  0.1730  0.1311  0.0945  0.0656  0.0887
   40  0.6822  0.4692  0.3392  0.1556  0.1349  0.1179  0.0635  0.0808
   60  0.6959  0.4563  0.3265  0.1529  0.1341  0.1006  0.0658  0.0827
  100  0.6857  0.4668  0.3434  0.1555  0.1285  0.1185  0.0643  0.0753
  140  0.6789  0.4714  0.3259  0.1565  0.1280  0.1069  0.0592  0.0733
  500  0.6649  0.4603  0.3035  0.1429  0.1098  0.0937  0.0394  0.0563

EST. MSE Of The Sample Coeff. of Variation Of An EXP(1)

  B/n      10      20      25      40      50      70     100     140
    5  0.0667  0.0391  0.0285  0.0238  0.0183  0.0144  0.0090  0.0080
    8  0.0618  0.0352  0.0299  0.0249  0.0160  0.0156  0.0079  0.0080
   10  0.0618  0.0340  0.0269  0.0218  0.0169  0.0126  0.0084  0.0080
   15  0.0598  0.0336  0.0268  0.0221  0.0158  0.0127  0.0076  0.0079
   20  0.0599  0.0313  0.0263  0.0218  0.0156  0.0133  0.0077  0.0068
   25  0.0590  0.0323  0.0246  0.0223  0.0156  0.0137  0.0079  0.0074
   40  0.0584  0.0309  0.0255  0.0208  0.0153  0.0120  0.0073  0.0071
   60  0.0578  0.0313  0.0253  0.0214  0.0154  0.0127  0.0078  0.0070
  100  0.0580  0.0304  0.0249  0.0213  0.0151  0.0122  0.0070  0.0073
  140  0.0573  0.0308  0.0252  0.0215  0.0147  0.0123  0.0074  0.0074
  500  0.0419  0.0297  0.0204  0.0187  0.0115  0.0100  0.0057  0.0039

Figure C.1  MSE*h of the Estimators for Exp(1).

  B/n       5      10      15      20      25      30      50      60
    5  0.4213  0.2045  0.2934  0.3217  0.1813  0.1527  0.0790  0.0565
    8  0.4229  0.1951  0.2726  0.2332  0.1633  0.1383  0.0646  0.0449
   10  0.3397  0.2134  0.2294  0.2195  0.1672  0.1376  0.0704  0.0417
   15  0.3410  0.1904  0.2629  0.1974  0.1834  0.1415  0.0642  0.0442
   20  0.3668  0.1975  0.2420  0.2365  0.1647  0.1467  0.0676  0.0430
   25  0.3505  0.1859  0.2397  0.2229  0.1535  0.1067  0.0701  0.0437
   30  0.3792  0.1851  0.2446  0.2307  0.1580  0.1196  0.0743  0.0449
   35  0.3409  0.1927  0.2254  0.2228  0.1523  0.1234  0.0733  0.0438
   40  0.3465  0.1896  0.2453  0.1988  0.1623  0.1215  0.0672  0.0426
   45  0.3571  0.1852  0.2544  0.2191  0.1603  0.1290  0.0677  0.0420
   50  0.3678  0.1888  0.2405  0.2318  0.1478  0.1191  0.0693  0.0439
  100  0.3313  0.1785  0.2230  0.2191  0.1576  0.1229  0.0674  0.0409
  500  0.3165  0.1582  0.1341  0.1217  0.1117  0.1095  0.0441  0.0287

EST. MSE Of The Sample Variance Of A N(0,1)

  B/n       5      10      15      20      25      30      50      60
    5  0.4158  0.2142  0.1416  0.1145  0.0987  0.0719  0.0413  0.0375
    8  0.3841  0.2049  0.1363  0.1005  0.0970  0.0701  0.0490  0.0271
   10  0.3650  0.1931  0.1346  0.1018  0.0930  0.0590  0.0424  0.0350
   15  0.3687  0.1948  0.1332  0.1008  0.0853  0.0633  0.0444  0.0356
   20  0.3541  0.1848  0.1298  0.0988  0.0835  0.0610  0.0420  0.0306
   25  0.3712  0.1870  0.1225  0.0948  0.0848  0.0674  0.0398  0.0304
   30  0.3570  0.1820  0.1250  0.0963  0.0847  0.0611  0.0416  0.0313
   35  0.3632  0.1869  0.1266  0.0925  0.0850  0.0623  0.0399  0.0297
   40  0.3474  0.1831  0.1252  0.0908  0.0818  0.0622  0.0414  0.0301
   45  0.3595  0.1839  0.1223  0.0924  0.0809  0.0640  0.0408  0.0306
   50  0.3625  0.1897  0.1211  0.0916  0.0827  0.0603  0.0408  0.0302
  100  0.3644  0.1611  0.1132  0.0841  0.0806  0.0619  0.0412  0.0300
  500  0.3175  0.1392  0.1008  0.0610  0.0715  0.0522  0.0391  0.0205

EST. MSE Of The Sample Variance Of A L(0,1)

  B/n       5      10      15      20      25      30      50      60
    5  2.9553  2.3940  1.5890  1.0396  0.8608  0.7340  0.5076  0.4655
    8  2.8503  2.0733  1.6019  0.9700  0.7033  0.6355  0.5318  0.3749
   10  2.7371  2.0438  1.6862  0.9944  0.7115  0.7020  0.4938  0.4011
   15  2.7377  1.9280  1.7109  0.9290  0.7775  0.6838  0.4844  0.3128
   20  2.7954  1.8716  1.5557  0.9623  0.6811  0.6798  0.4974  0.3277
   25  2.6397  1.8955  1.5850  0.9498  0.7466  0.6352  0.4633  0.3654
   30  2.6941  1.8366  1.7492  0.8812  0.7106  0.6430  0.4849  0.3270
   35  2.7119  1.8774  1.5792  0.8772  0.7000  0.6618  0.4890  0.3512
   40  2.6518  1.8689  1.8452  0.8875  0.7028  0.6250  0.4785  0.3479
   45  2.6200  1.8315  1.6082  0.9156  0.7119  0.5982  0.4987  0.3234
   50  2.6419  1.8801  1.7016  0.8712  0.6749  0.6377  0.4652  0.3489
  100  2.6334  1.8705  1.4931  0.8678  0.6827  0.6336  0.4763  0.3329
  500  2.4163  1.6915  1.3852  0.7542  0.6173  0.5918  0.4258  0.3039

EST. MSE Of Sample Variance of a N(0,1)

  B/n       5      10      15      20      25      30      50      60
    5  0.4206  0.2099  0.1609  0.1025  0.0916  0.0680  0.0477  0.0379
    8  0.3855  0.2032  0.1294  0.1084  0.0875  0.0702  0.0474  0.0316
   10  0.3939  0.1986  0.1396  0.0964  0.0990  0.0667  0.0445  0.0292
   15  0.3743  0.1942  0.1344  0.0961  0.0842  0.0658  0.0398  0.0325
   20  0.3674  0.1854  0.1218  0.0971  0.0842  0.0665  0.0403  0.0319
   25  0.3589  0.1898  0.1313  0.0968  0.0859  0.0619  0.0408  0.0312
   30  0.3547  0.1851  0.1273  0.0949  0.0849  0.0615  0.0389  0.0317
   35  0.3647  0.1861  0.1242  0.0949  0.0819  0.0622  0.0422  0.0310
   40  0.3490  0.1851  0.1277  0.0928  0.0854  0.0631  0.0399  0.0314
   45  0.3568  0.1871  0.1231  0.0915  0.0857  0.0632  0.0389  0.0298
   50  0.3549  0.1862  0.1234  0.0940  0.0835  0.0650  0.0388  0.0311

EST. MSE Of Sample Variance (2nd Estimator) of N(0,1)

  B/n       5      10      15      20      25      30      50      60
    5  0.5810  0.2467  0.1537  0.1091  0.0908  0.0879  0.0557  0.0384
    8  0.5356  0.2540  0.1367  0.1164  0.0882  0.0844  0.0630  0.0368
   10  0.5686  0.2461  0.1394  0.1026  0.0732  0.0790  0.0576  0.0408
   15  0.5387  0.2304  0.1398  0.1067  0.0812  0.0685  0.0573  0.0369
   20  0.5403  0.2285  0.1277  0.104J  0.0786  0.0727  0.0493  0.0383
   25  0.5198  0.2204  0.1322  0.0989  0.0784  0.0754  0.0530  0.0340
   30  0.5407  0.2270  0.1342  0.1023  0.0778  0.0742  0.0535  0.0330
   35  0.5355  0.2249  0.1313  0.1005  0.0782  0.0740  0.0531  0.0347
   40  0.5310  0.2225  0.1324  0.1034  0.0757  0.0744  0.0544  0.0356
   45  0.5166  0.2261  0.1312  0.1036  0.0775  0.0713  0.0518  0.0362
   50  0.5141  0.2242  0.1293  0.0994  0.0769  0.0712  0.0530  0.0360

EST. MSE Of Sample Variance (3rd Estimator) of a N(0,1)

  B/n       5      10      15      20      25      30      50      60
    5  0.3794  0.1714  0.1354  0.1222  0.0904  0.0673  0.0433  0.0410
    8  0.3518  0.1706  0.1349  0.1173  0.0768  0.0612  0.0453  0.0363
   10  0.3471  0.1729  0.1359  0.1132  0.0856  0.0622  0.0475  0.0403
   15  0.3356  0.1542  0.1275  0.1055  0.0750  0.0578  0.0433  0.0364
   20  0.3319  0.1568  0.1241  0.1119  0.0755  0.0595  0.0370  0.0345
   25  0.3243  0.1615  0.1256  0.1089  0.0782  0.0563  0.0409  0.0332
   30  0.3218  0.1573  0.1180  0.1095  0.0757  0.0552  0.0419  0.0322
   35  0.3244  0.1576  0.1218  0.1034  0.0787  0.0553  0.0428  0.0320
   40  0.3253  0.1522  0.1225  0.1076  0.0771  0.0553  0.0420  0.0366
   45  0.3200  0.1573  0.1232  0.1056  0.0758  0.0565  0.0407  0.0351
   50  0.3308  0.1565  0.1220  0.1064  0.0764  0.0552  0.0401  0.0347

Figure C.3  MSE*h of 1S*2, 2S*2 and of 3S*2.

Figure C.5  Bootstrap Dist. of Sample Variance B = 150.